linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH next v1 0/3] printk: remove safe buffers
@ 2021-03-16 23:33 John Ogness
  2021-03-16 23:33 ` [PATCH next v1 1/3] printk: track/limit recursion John Ogness
                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: John Ogness @ 2021-03-16 23:33 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, linux-kernel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Eric Biederman,
	Nicholas Piggin, Christophe Leroy, Alistair Popple,
	Jordan Niethe, Peter Zijlstra, Cédric Le Goater,
	Andrew Morton, Kees Cook, Yue Hu, Alexey Kardashevskiy,
	Rafael Aquini, Tiezhu Yang, Guilherme G. Piccoli,
	Paul E. McKenney, linuxppc-dev, kexec

Hello,

Here is v1 of a series to remove the safe buffers. They are no
longer needed because messages can be stored directly into the
log buffer from any context.

However, the safe buffers also provided a form of recursion
protection. For that reason, explicit recursion protection is
also implemented for this series.

This series falls in line with the printk-rework plan as
presented [0] at Linux Plumbers in Lisbon 2019.

This series is based on next-20210316.

John Ogness

[0] https://linuxplumbersconf.org/event/4/contributions/290/attachments/276/463/lpc2019_jogness_printk.pdf (slide 23)

John Ogness (3):
  printk: track/limit recursion
  printk: remove safe buffers
  printk: convert @syslog_lock to spin_lock

 arch/powerpc/kernel/traps.c    |   1 -
 arch/powerpc/kernel/watchdog.c |   5 -
 include/linux/printk.h         |  10 -
 kernel/kexec_core.c            |   1 -
 kernel/panic.c                 |   3 -
 kernel/printk/internal.h       |   2 -
 kernel/printk/printk.c         | 171 +++++++++--------
 kernel/printk/printk_safe.c    | 332 +--------------------------------
 lib/nmi_backtrace.c            |   6 -
 9 files changed, 100 insertions(+), 431 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH next v1 1/3] printk: track/limit recursion
  2021-03-16 23:33 [PATCH next v1 0/3] printk: remove safe buffers John Ogness
@ 2021-03-16 23:33 ` John Ogness
  2021-03-21  5:34   ` Sergey Senozhatsky
  2021-03-22 14:49   ` Petr Mladek
  2021-03-16 23:33 ` [PATCH next v1 2/3] printk: remove safe buffers John Ogness
  2021-03-16 23:33 ` [PATCH next v1 3/3] printk: convert @syslog_lock to spin_lock John Ogness
  2 siblings, 2 replies; 23+ messages in thread
From: John Ogness @ 2021-03-16 23:33 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, linux-kernel

Track printk() recursion and limit it to 3 levels per-CPU and per-context.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 kernel/printk/printk.c | 80 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 77 insertions(+), 3 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 2f829fbf0a13..c666e3e43f0c 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1940,6 +1940,71 @@ static void call_console_drivers(const char *ext_text, size_t ext_len,
 	}
 }
 
+/*
+ * Recursion is tracked separately on each CPU. If NMIs are supported, an
+ * additional NMI context per CPU is also separately tracked. Until per-CPU
+ * is available, a separate "early tracking" is performed.
+ */
+#ifdef CONFIG_PRINTK_NMI
+#define PRINTK_CTX_NUM 2
+#else
+#define PRINTK_CTX_NUM 1
+#endif
+static DEFINE_PER_CPU(char [PRINTK_CTX_NUM], printk_count);
+static char printk_count_early[PRINTK_CTX_NUM];
+
+/*
+ * Recursion is limited to keep the output sane. printk() should not require
+ * more than 1 level of recursion (allowing, for example, printk() to trigger
+ * a WARN), but a higher value is used in case some printk-internal errors
+ * exist, such as the ringbuffer validation checks failing.
+ */
+#define PRINTK_MAX_RECURSION 3
+
+/* Return a pointer to the dedicated counter for the CPU+context of the caller. */
+static char *printk_recursion_counter(void)
+{
+	int ctx = 0;
+
+#ifdef CONFIG_PRINTK_NMI
+	if (in_nmi())
+		ctx = 1;
+#endif
+	if (!printk_percpu_data_ready())
+		return &printk_count_early[ctx];
+	return &((*this_cpu_ptr(&printk_count))[ctx]);
+}
+
+/*
+ * Enter recursion tracking. Interrupts are disabled to simplify tracking.
+ * The caller must check the return value to see if the recursion is allowed.
+ * On failure, interrupts are not disabled.
+ */
+static bool printk_enter_irqsave(unsigned long *flags)
+{
+	char *count;
+
+	local_irq_save(*flags);
+	count = printk_recursion_counter();
+	if (*count > PRINTK_MAX_RECURSION) {
+		local_irq_restore(*flags);
+		return false;
+	}
+	(*count)++;
+
+	return true;
+}
+
+/* Exit recursion tracking, restoring interrupts. */
+static void printk_exit_irqrestore(unsigned long flags)
+{
+	char *count;
+
+	count = printk_recursion_counter();
+	(*count)--;
+	local_irq_restore(flags);
+}
+
 int printk_delay_msec __read_mostly;
 
 static inline void printk_delay(void)
@@ -2040,11 +2105,13 @@ int vprintk_store(int facility, int level,
 	struct prb_reserved_entry e;
 	enum log_flags lflags = 0;
 	struct printk_record r;
+	unsigned long irqflags;
 	u16 trunc_msg_len = 0;
 	char prefix_buf[8];
 	u16 reserve_size;
 	va_list args2;
 	u16 text_len;
+	int ret = 0;
 	u64 ts_nsec;
 
 	/*
@@ -2055,6 +2122,9 @@ int vprintk_store(int facility, int level,
 	 */
 	ts_nsec = local_clock();
 
+	if (!printk_enter_irqsave(&irqflags))
+		return 0;
+
 	/*
 	 * The sprintf needs to come first since the syslog prefix might be
 	 * passed in as a parameter. An extra byte must be reserved so that
@@ -2092,7 +2162,8 @@ int vprintk_store(int facility, int level,
 				prb_commit(&e);
 			}
 
-			return text_len;
+			ret = text_len;
+			goto out;
 		}
 	}
 
@@ -2108,7 +2179,7 @@ int vprintk_store(int facility, int level,
 
 		prb_rec_init_wr(&r, reserve_size + trunc_msg_len);
 		if (!prb_reserve(&e, prb, &r))
-			return 0;
+			goto out;
 	}
 
 	/* fill message */
@@ -2130,7 +2201,10 @@ int vprintk_store(int facility, int level,
 	else
 		prb_final_commit(&e);
 
-	return (text_len + trunc_msg_len);
+	ret = text_len + trunc_msg_len;
+out:
+	printk_exit_irqrestore(irqflags);
+	return ret;
 }
 
 asmlinkage int vprintk_emit(int facility, int level,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH next v1 2/3] printk: remove safe buffers
  2021-03-16 23:33 [PATCH next v1 0/3] printk: remove safe buffers John Ogness
  2021-03-16 23:33 ` [PATCH next v1 1/3] printk: track/limit recursion John Ogness
@ 2021-03-16 23:33 ` John Ogness
  2021-03-21  5:26   ` Sergey Senozhatsky
  2021-03-23 10:47   ` Petr Mladek
  2021-03-16 23:33 ` [PATCH next v1 3/3] printk: convert @syslog_lock to spin_lock John Ogness
  2 siblings, 2 replies; 23+ messages in thread
From: John Ogness @ 2021-03-16 23:33 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, linux-kernel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Eric Biederman,
	Nicholas Piggin, Christophe Leroy, Alistair Popple,
	Jordan Niethe, Peter Zijlstra, Cédric Le Goater,
	Andrew Morton, Kees Cook, Yue Hu, Alexey Kardashevskiy,
	Rafael Aquini, Tiezhu Yang, Guilherme G. Piccoli,
	Paul E. McKenney, linuxppc-dev, kexec

With @logbuf_lock removed, the high level printk functions for
storing messages are lockless. Messages can be stored from any
context, so there is no need for the NMI and safe buffers anymore.
Remove the NMI and safe buffers.

Although the safe buffers are removed, the NMI and safe context
tracking is still in place. In these contexts, store the message
immediately but still use irq_work to defer the console printing.

Since printk recursion tracking is in place, safe context tracking
for most of printk is not needed. Remove it. Only safe context
tracking relating to the console lock is left in place. This is
because the console lock is needed for the actual printing.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 arch/powerpc/kernel/traps.c    |   1 -
 arch/powerpc/kernel/watchdog.c |   5 -
 include/linux/printk.h         |  10 -
 kernel/kexec_core.c            |   1 -
 kernel/panic.c                 |   3 -
 kernel/printk/internal.h       |   2 -
 kernel/printk/printk.c         |  81 ++------
 kernel/printk/printk_safe.c    | 332 +--------------------------------
 lib/nmi_backtrace.c            |   6 -
 9 files changed, 18 insertions(+), 423 deletions(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index a44a30b0688c..5828c83eaca6 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -171,7 +171,6 @@ extern void panic_flush_kmsg_start(void)
 
 extern void panic_flush_kmsg_end(void)
 {
-	printk_safe_flush_on_panic();
 	kmsg_dump(KMSG_DUMP_PANIC);
 	bust_spinlocks(0);
 	debug_locks_off();
diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index c9a8f4781a10..dc17d8903d4f 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -183,11 +183,6 @@ static void watchdog_smp_panic(int cpu, u64 tb)
 
 	wd_smp_unlock(&flags);
 
-	printk_safe_flush();
-	/*
-	 * printk_safe_flush() seems to require another print
-	 * before anything actually goes out to console.
-	 */
 	if (sysctl_hardlockup_all_cpu_backtrace)
 		trigger_allbutself_cpu_backtrace();
 
diff --git a/include/linux/printk.h b/include/linux/printk.h
index fe7eb2351610..2476796c1150 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -207,8 +207,6 @@ __printf(1, 2) void dump_stack_set_arch_desc(const char *fmt, ...);
 void dump_stack_print_info(const char *log_lvl);
 void show_regs_print_info(const char *log_lvl);
 extern asmlinkage void dump_stack(void) __cold;
-extern void printk_safe_flush(void);
-extern void printk_safe_flush_on_panic(void);
 #else
 static inline __printf(1, 0)
 int vprintk(const char *s, va_list args)
@@ -272,14 +270,6 @@ static inline void show_regs_print_info(const char *log_lvl)
 static inline void dump_stack(void)
 {
 }
-
-static inline void printk_safe_flush(void)
-{
-}
-
-static inline void printk_safe_flush_on_panic(void)
-{
-}
 #endif
 
 extern int kptr_restrict;
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index f04d04d1b855..64bf5d5cdd06 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -977,7 +977,6 @@ void crash_kexec(struct pt_regs *regs)
 	old_cpu = atomic_cmpxchg(&panic_cpu, PANIC_CPU_INVALID, this_cpu);
 	if (old_cpu == PANIC_CPU_INVALID) {
 		/* This is the 1st CPU which comes here, so go ahead. */
-		printk_safe_flush_on_panic();
 		__crash_kexec(regs);
 
 		/*
diff --git a/kernel/panic.c b/kernel/panic.c
index 332736a72a58..1f0df42f8d0c 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -247,7 +247,6 @@ void panic(const char *fmt, ...)
 	 * Bypass the panic_cpu check and call __crash_kexec directly.
 	 */
 	if (!_crash_kexec_post_notifiers) {
-		printk_safe_flush_on_panic();
 		__crash_kexec(NULL);
 
 		/*
@@ -271,8 +270,6 @@ void panic(const char *fmt, ...)
 	 */
 	atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
 
-	/* Call flush even twice. It tries harder with a single online CPU */
-	printk_safe_flush_on_panic();
 	kmsg_dump(KMSG_DUMP_PANIC);
 
 	/*
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index e7acc2888c8e..e108b2ece8c7 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -23,7 +23,6 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args);
 void __printk_safe_enter(void);
 void __printk_safe_exit(void);
 
-void printk_safe_init(void);
 bool printk_percpu_data_ready(void);
 
 #define printk_safe_enter_irqsave(flags)	\
@@ -67,6 +66,5 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args) { return 0; }
 #define printk_safe_enter_irq() local_irq_disable()
 #define printk_safe_exit_irq() local_irq_enable()
 
-static inline void printk_safe_init(void) { }
 static inline bool printk_percpu_data_ready(void) { return false; }
 #endif /* CONFIG_PRINTK */
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index c666e3e43f0c..fa52a5daa232 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -732,27 +732,22 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
 	if (ret)
 		return ret;
 
-	printk_safe_enter_irq();
 	if (!prb_read_valid(prb, atomic64_read(&user->seq), r)) {
 		if (file->f_flags & O_NONBLOCK) {
 			ret = -EAGAIN;
-			printk_safe_exit_irq();
 			goto out;
 		}
 
-		printk_safe_exit_irq();
 		ret = wait_event_interruptible(log_wait,
 				prb_read_valid(prb, atomic64_read(&user->seq), r));
 		if (ret)
 			goto out;
-		printk_safe_enter_irq();
 	}
 
 	if (r->info->seq != atomic64_read(&user->seq)) {
 		/* our last seen message is gone, return error and reset */
 		atomic64_set(&user->seq, r->info->seq);
 		ret = -EPIPE;
-		printk_safe_exit_irq();
 		goto out;
 	}
 
@@ -762,7 +757,6 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
 				  &r->info->dev_info);
 
 	atomic64_set(&user->seq, r->info->seq + 1);
-	printk_safe_exit_irq();
 
 	if (len > count) {
 		ret = -EINVAL;
@@ -797,7 +791,6 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence)
 	if (offset)
 		return -ESPIPE;
 
-	printk_safe_enter_irq();
 	switch (whence) {
 	case SEEK_SET:
 		/* the first record */
@@ -818,7 +811,6 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence)
 	default:
 		ret = -EINVAL;
 	}
-	printk_safe_exit_irq();
 	return ret;
 }
 
@@ -833,7 +825,6 @@ static __poll_t devkmsg_poll(struct file *file, poll_table *wait)
 
 	poll_wait(file, &log_wait, wait);
 
-	printk_safe_enter_irq();
 	if (prb_read_valid_info(prb, atomic64_read(&user->seq), &info, NULL)) {
 		/* return error when data has vanished underneath us */
 		if (info.seq != atomic64_read(&user->seq))
@@ -841,7 +832,6 @@ static __poll_t devkmsg_poll(struct file *file, poll_table *wait)
 		else
 			ret = EPOLLIN|EPOLLRDNORM;
 	}
-	printk_safe_exit_irq();
 
 	return ret;
 }
@@ -874,9 +864,7 @@ static int devkmsg_open(struct inode *inode, struct file *file)
 	prb_rec_init_rd(&user->record, &user->info,
 			&user->text_buf[0], sizeof(user->text_buf));
 
-	printk_safe_enter_irq();
 	atomic64_set(&user->seq, prb_first_valid_seq(prb));
-	printk_safe_exit_irq();
 
 	file->private_data = user;
 	return 0;
@@ -1042,9 +1030,6 @@ static inline void log_buf_add_cpu(void) {}
 
 static void __init set_percpu_data_ready(void)
 {
-	printk_safe_init();
-	/* Make sure we set this flag only after printk_safe() init is done */
-	barrier();
 	__printk_percpu_data_ready = true;
 }
 
@@ -1084,7 +1069,6 @@ void __init setup_log_buf(int early)
 	struct printk_record r;
 	size_t new_descs_size;
 	size_t new_infos_size;
-	unsigned long flags;
 	char *new_log_buf;
 	unsigned int free;
 	u64 seq;
@@ -1142,8 +1126,6 @@ void __init setup_log_buf(int early)
 		 new_descs, ilog2(new_descs_count),
 		 new_infos);
 
-	printk_safe_enter_irqsave(flags);
-
 	log_buf_len = new_log_buf_len;
 	log_buf = new_log_buf;
 	new_log_buf_len = 0;
@@ -1159,8 +1141,6 @@ void __init setup_log_buf(int early)
 	 */
 	prb = &printk_rb_dynamic;
 
-	printk_safe_exit_irqrestore(flags);
-
 	if (seq != prb_next_seq(&printk_rb_static)) {
 		pr_err("dropped %llu messages\n",
 		       prb_next_seq(&printk_rb_static) - seq);
@@ -1498,11 +1478,9 @@ static int syslog_print(char __user *buf, int size)
 		size_t n;
 		size_t skip;
 
-		printk_safe_enter_irq();
-		raw_spin_lock(&syslog_lock);
+		raw_spin_lock_irq(&syslog_lock);
 		if (!prb_read_valid(prb, syslog_seq, &r)) {
-			raw_spin_unlock(&syslog_lock);
-			printk_safe_exit_irq();
+			raw_spin_unlock_irq(&syslog_lock);
 			break;
 		}
 		if (r.info->seq != syslog_seq) {
@@ -1531,8 +1509,7 @@ static int syslog_print(char __user *buf, int size)
 			syslog_partial += n;
 		} else
 			n = 0;
-		raw_spin_unlock(&syslog_lock);
-		printk_safe_exit_irq();
+		raw_spin_unlock_irq(&syslog_lock);
 
 		if (!n)
 			break;
@@ -1566,7 +1543,6 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 		return -ENOMEM;
 
 	time = printk_time;
-	printk_safe_enter_irq();
 	/*
 	 * Find first record that fits, including all following records,
 	 * into the user-provided buffer for this dump.
@@ -1587,23 +1563,20 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 			break;
 		}
 
-		printk_safe_exit_irq();
 		if (copy_to_user(buf + len, text, textlen))
 			len = -EFAULT;
 		else
 			len += textlen;
-		printk_safe_enter_irq();
 
 		if (len < 0)
 			break;
 	}
 
 	if (clear) {
-		raw_spin_lock(&syslog_lock);
+		raw_spin_lock_irq(&syslog_lock);
 		latched_seq_write(&clear_seq, seq);
-		raw_spin_unlock(&syslog_lock);
+		raw_spin_unlock_irq(&syslog_lock);
 	}
-	printk_safe_exit_irq();
 
 	kfree(text);
 	return len;
@@ -1611,11 +1584,9 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 
 static void syslog_clear(void)
 {
-	printk_safe_enter_irq();
-	raw_spin_lock(&syslog_lock);
+	raw_spin_lock_irq(&syslog_lock);
 	latched_seq_write(&clear_seq, prb_next_seq(prb));
-	raw_spin_unlock(&syslog_lock);
-	printk_safe_exit_irq();
+	raw_spin_unlock_irq(&syslog_lock);
 }
 
 /* Return a consistent copy of @syslog_seq. */
@@ -1703,12 +1674,10 @@ int do_syslog(int type, char __user *buf, int len, int source)
 		break;
 	/* Number of chars in the log buffer */
 	case SYSLOG_ACTION_SIZE_UNREAD:
-		printk_safe_enter_irq();
-		raw_spin_lock(&syslog_lock);
+		raw_spin_lock_irq(&syslog_lock);
 		if (!prb_read_valid_info(prb, syslog_seq, &info, NULL)) {
 			/* No unread messages. */
-			raw_spin_unlock(&syslog_lock);
-			printk_safe_exit_irq();
+			raw_spin_unlock_irq(&syslog_lock);
 			return 0;
 		}
 		if (info.seq != syslog_seq) {
@@ -1736,8 +1705,7 @@ int do_syslog(int type, char __user *buf, int len, int source)
 			}
 			error -= syslog_partial;
 		}
-		raw_spin_unlock(&syslog_lock);
-		printk_safe_exit_irq();
+		raw_spin_unlock_irq(&syslog_lock);
 		break;
 	/* Size of the log buffer */
 	case SYSLOG_ACTION_SIZE_BUFFER:
@@ -2213,7 +2181,6 @@ asmlinkage int vprintk_emit(int facility, int level,
 {
 	int printed_len;
 	bool in_sched = false;
-	unsigned long flags;
 
 	/* Suppress unimportant messages after panic happens */
 	if (unlikely(suppress_printk))
@@ -2227,9 +2194,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 	boot_delay_msec(level);
 	printk_delay();
 
-	printk_safe_enter_irqsave(flags);
 	printed_len = vprintk_store(facility, level, dev_info, fmt, args);
-	printk_safe_exit_irqrestore(flags);
 
 	/* If called from the scheduler, we can not call up(). */
 	if (!in_sched) {
@@ -2666,7 +2631,6 @@ void console_unlock(void)
 		size_t ext_len = 0;
 		size_t len;
 
-		printk_safe_enter_irqsave(flags);
 skip:
 		if (!prb_read_valid(prb, console_seq, &r))
 			break;
@@ -2711,6 +2675,8 @@ void console_unlock(void)
 				printk_time);
 		console_seq++;
 
+		printk_safe_enter_irqsave(flags);
+
 		/*
 		 * While actively printing out messages, if another printk()
 		 * were to occur on another CPU, it may wait for this one to
@@ -2745,8 +2711,6 @@ void console_unlock(void)
 	 * flush, no worries.
 	 */
 	retry = prb_read_valid(prb, console_seq, NULL);
-	printk_safe_exit_irqrestore(flags);
-
 	if (retry && console_trylock())
 		goto again;
 }
@@ -2808,13 +2772,8 @@ void console_flush_on_panic(enum con_flush_mode mode)
 	console_trylock();
 	console_may_schedule = 0;
 
-	if (mode == CONSOLE_REPLAY_ALL) {
-		unsigned long flags;
-
-		printk_safe_enter_irqsave(flags);
+	if (mode == CONSOLE_REPLAY_ALL)
 		console_seq = prb_first_valid_seq(prb);
-		printk_safe_exit_irqrestore(flags);
-	}
 	console_unlock();
 }
 
@@ -3466,14 +3425,12 @@ bool kmsg_dump_get_line(struct kmsg_dump_iter *iter, bool syslog,
 	struct printk_info info;
 	unsigned int line_count;
 	struct printk_record r;
-	unsigned long flags;
 	size_t l = 0;
 	bool ret = false;
 
 	if (iter->cur_seq < min_seq)
 		iter->cur_seq = min_seq;
 
-	printk_safe_enter_irqsave(flags);
 	prb_rec_init_rd(&r, &info, line, size);
 
 	/* Read text or count text lines? */
@@ -3494,7 +3451,6 @@ bool kmsg_dump_get_line(struct kmsg_dump_iter *iter, bool syslog,
 	iter->cur_seq = r.info->seq + 1;
 	ret = true;
 out:
-	printk_safe_exit_irqrestore(flags);
 	if (len)
 		*len = l;
 	return ret;
@@ -3526,7 +3482,6 @@ bool kmsg_dump_get_buffer(struct kmsg_dump_iter *iter, bool syslog,
 	u64 min_seq = latched_seq_read_nolock(&clear_seq);
 	struct printk_info info;
 	struct printk_record r;
-	unsigned long flags;
 	u64 seq;
 	u64 next_seq;
 	size_t len = 0;
@@ -3539,7 +3494,6 @@ bool kmsg_dump_get_buffer(struct kmsg_dump_iter *iter, bool syslog,
 	if (iter->cur_seq < min_seq)
 		iter->cur_seq = min_seq;
 
-	printk_safe_enter_irqsave(flags);
 	if (prb_read_valid_info(prb, iter->cur_seq, &info, NULL)) {
 		if (info.seq != iter->cur_seq) {
 			/* messages are gone, move to first available one */
@@ -3548,10 +3502,8 @@ bool kmsg_dump_get_buffer(struct kmsg_dump_iter *iter, bool syslog,
 	}
 
 	/* last entry */
-	if (iter->cur_seq >= iter->next_seq) {
-		printk_safe_exit_irqrestore(flags);
+	if (iter->cur_seq >= iter->next_seq)
 		goto out;
-	}
 
 	/*
 	 * Find first record that fits, including all following records,
@@ -3583,7 +3535,6 @@ bool kmsg_dump_get_buffer(struct kmsg_dump_iter *iter, bool syslog,
 
 	iter->next_seq = next_seq;
 	ret = true;
-	printk_safe_exit_irqrestore(flags);
 out:
 	if (len_out)
 		*len_out = len;
@@ -3601,12 +3552,8 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer);
  */
 void kmsg_dump_rewind(struct kmsg_dump_iter *iter)
 {
-	unsigned long flags;
-
-	printk_safe_enter_irqsave(flags);
 	iter->cur_seq = latched_seq_read_nolock(&clear_seq);
 	iter->next_seq = prb_next_seq(prb);
-	printk_safe_exit_irqrestore(flags);
 }
 EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
 
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index bbf5c1993636..ccebce4d5207 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -15,286 +15,9 @@
 
 #include "internal.h"
 
-/*
- * In NMI and safe mode, printk() avoids taking locks. Instead,
- * it uses an alternative implementation that temporary stores
- * the strings into a per-CPU buffer. The content of the buffer
- * is later flushed into the main ring buffer via IRQ work.
- *
- * The alternative implementation is chosen transparently
- * by examining current printk() context mask stored in @printk_context
- * per-CPU variable.
- *
- * The implementation allows to flush the strings also from another CPU.
- * There are situations when we want to make sure that all buffers
- * were handled or when IRQs are blocked.
- */
-
-#define SAFE_LOG_BUF_LEN ((1 << CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT) -	\
-				sizeof(atomic_t) -			\
-				sizeof(atomic_t) -			\
-				sizeof(struct irq_work))
-
-struct printk_safe_seq_buf {
-	atomic_t		len;	/* length of written data */
-	atomic_t		message_lost;
-	struct irq_work		work;	/* IRQ work that flushes the buffer */
-	unsigned char		buffer[SAFE_LOG_BUF_LEN];
-};
-
-static DEFINE_PER_CPU(struct printk_safe_seq_buf, safe_print_seq);
 static DEFINE_PER_CPU(int, printk_context);
 
-static DEFINE_RAW_SPINLOCK(safe_read_lock);
-
-#ifdef CONFIG_PRINTK_NMI
-static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq);
-#endif
-
-/* Get flushed in a more safe context. */
-static void queue_flush_work(struct printk_safe_seq_buf *s)
-{
-	if (printk_percpu_data_ready())
-		irq_work_queue(&s->work);
-}
-
-/*
- * Add a message to per-CPU context-dependent buffer. NMI and printk-safe
- * have dedicated buffers, because otherwise printk-safe preempted by
- * NMI-printk would have overwritten the NMI messages.
- *
- * The messages are flushed from irq work (or from panic()), possibly,
- * from other CPU, concurrently with printk_safe_log_store(). Should this
- * happen, printk_safe_log_store() will notice the buffer->len mismatch
- * and repeat the write.
- */
-static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
-						const char *fmt, va_list args)
-{
-	int add;
-	size_t len;
-	va_list ap;
-
-again:
-	len = atomic_read(&s->len);
-
-	/* The trailing '\0' is not counted into len. */
-	if (len >= sizeof(s->buffer) - 1) {
-		atomic_inc(&s->message_lost);
-		queue_flush_work(s);
-		return 0;
-	}
-
-	/*
-	 * Make sure that all old data have been read before the buffer
-	 * was reset. This is not needed when we just append data.
-	 */
-	if (!len)
-		smp_rmb();
-
-	va_copy(ap, args);
-	add = vscnprintf(s->buffer + len, sizeof(s->buffer) - len, fmt, ap);
-	va_end(ap);
-	if (!add)
-		return 0;
-
-	/*
-	 * Do it once again if the buffer has been flushed in the meantime.
-	 * Note that atomic_cmpxchg() is an implicit memory barrier that
-	 * makes sure that the data were written before updating s->len.
-	 */
-	if (atomic_cmpxchg(&s->len, len, len + add) != len)
-		goto again;
-
-	queue_flush_work(s);
-	return add;
-}
-
-static inline void printk_safe_flush_line(const char *text, int len)
-{
-	/*
-	 * Avoid any console drivers calls from here, because we may be
-	 * in NMI or printk_safe context (when in panic). The messages
-	 * must go only into the ring buffer at this stage.  Consoles will
-	 * get explicitly called later when a crashdump is not generated.
-	 */
-	printk_deferred("%.*s", len, text);
-}
-
-/* printk part of the temporary buffer line by line */
-static int printk_safe_flush_buffer(const char *start, size_t len)
-{
-	const char *c, *end;
-	bool header;
-
-	c = start;
-	end = start + len;
-	header = true;
-
-	/* Print line by line. */
-	while (c < end) {
-		if (*c == '\n') {
-			printk_safe_flush_line(start, c - start + 1);
-			start = ++c;
-			header = true;
-			continue;
-		}
-
-		/* Handle continuous lines or missing new line. */
-		if ((c + 1 < end) && printk_get_level(c)) {
-			if (header) {
-				c = printk_skip_level(c);
-				continue;
-			}
-
-			printk_safe_flush_line(start, c - start);
-			start = c++;
-			header = true;
-			continue;
-		}
-
-		header = false;
-		c++;
-	}
-
-	/* Check if there was a partial line. Ignore pure header. */
-	if (start < end && !header) {
-		static const char newline[] = KERN_CONT "\n";
-
-		printk_safe_flush_line(start, end - start);
-		printk_safe_flush_line(newline, strlen(newline));
-	}
-
-	return len;
-}
-
-static void report_message_lost(struct printk_safe_seq_buf *s)
-{
-	int lost = atomic_xchg(&s->message_lost, 0);
-
-	if (lost)
-		printk_deferred("Lost %d message(s)!\n", lost);
-}
-
-/*
- * Flush data from the associated per-CPU buffer. The function
- * can be called either via IRQ work or independently.
- */
-static void __printk_safe_flush(struct irq_work *work)
-{
-	struct printk_safe_seq_buf *s =
-		container_of(work, struct printk_safe_seq_buf, work);
-	unsigned long flags;
-	size_t len;
-	int i;
-
-	/*
-	 * The lock has two functions. First, one reader has to flush all
-	 * available message to make the lockless synchronization with
-	 * writers easier. Second, we do not want to mix messages from
-	 * different CPUs. This is especially important when printing
-	 * a backtrace.
-	 */
-	raw_spin_lock_irqsave(&safe_read_lock, flags);
-
-	i = 0;
-more:
-	len = atomic_read(&s->len);
-
-	/*
-	 * This is just a paranoid check that nobody has manipulated
-	 * the buffer an unexpected way. If we printed something then
-	 * @len must only increase. Also it should never overflow the
-	 * buffer size.
-	 */
-	if ((i && i >= len) || len > sizeof(s->buffer)) {
-		const char *msg = "printk_safe_flush: internal error\n";
-
-		printk_safe_flush_line(msg, strlen(msg));
-		len = 0;
-	}
-
-	if (!len)
-		goto out; /* Someone else has already flushed the buffer. */
-
-	/* Make sure that data has been written up to the @len */
-	smp_rmb();
-	i += printk_safe_flush_buffer(s->buffer + i, len - i);
-
-	/*
-	 * Check that nothing has got added in the meantime and truncate
-	 * the buffer. Note that atomic_cmpxchg() is an implicit memory
-	 * barrier that makes sure that the data were copied before
-	 * updating s->len.
-	 */
-	if (atomic_cmpxchg(&s->len, len, 0) != len)
-		goto more;
-
-out:
-	report_message_lost(s);
-	raw_spin_unlock_irqrestore(&safe_read_lock, flags);
-}
-
-/**
- * printk_safe_flush - flush all per-cpu nmi buffers.
- *
- * The buffers are flushed automatically via IRQ work. This function
- * is useful only when someone wants to be sure that all buffers have
- * been flushed at some point.
- */
-void printk_safe_flush(void)
-{
-	int cpu;
-
-	for_each_possible_cpu(cpu) {
-#ifdef CONFIG_PRINTK_NMI
-		__printk_safe_flush(&per_cpu(nmi_print_seq, cpu).work);
-#endif
-		__printk_safe_flush(&per_cpu(safe_print_seq, cpu).work);
-	}
-}
-
-/**
- * printk_safe_flush_on_panic - flush all per-cpu nmi buffers when the system
- *	goes down.
- *
- * Similar to printk_safe_flush() but it can be called even in NMI context when
- * the system goes down. It does the best effort to get NMI messages into
- * the main ring buffer.
- *
- * Note that it could try harder when there is only one CPU online.
- */
-void printk_safe_flush_on_panic(void)
-{
-	/*
-	 * Make sure that we could access the safe buffers.
-	 * Do not risk a double release when more CPUs are up.
-	 */
-	if (raw_spin_is_locked(&safe_read_lock)) {
-		if (num_online_cpus() > 1)
-			return;
-
-		debug_locks_off();
-		raw_spin_lock_init(&safe_read_lock);
-	}
-
-	printk_safe_flush();
-}
-
 #ifdef CONFIG_PRINTK_NMI
-/*
- * Safe printk() for NMI context. It uses a per-CPU buffer to
- * store the message. NMIs are not nested, so there is always only
- * one writer running. But the buffer might get flushed from another
- * CPU, so we need to be careful.
- */
-static __printf(1, 0) int vprintk_nmi(const char *fmt, va_list args)
-{
-	struct printk_safe_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
-
-	return printk_safe_log_store(s, fmt, args);
-}
-
 void noinstr printk_nmi_enter(void)
 {
 	this_cpu_add(printk_context, PRINTK_NMI_CONTEXT_OFFSET);
@@ -309,9 +32,6 @@ void noinstr printk_nmi_exit(void)
  * Marks a code that might produce many messages in NMI context
  * and the risk of losing them is more critical than eventual
  * reordering.
- *
- * It has effect only when called in NMI context. Then printk()
- * will store the messages into the main logbuf directly.
  */
 void printk_nmi_direct_enter(void)
 {
@@ -324,27 +44,8 @@ void printk_nmi_direct_exit(void)
 	this_cpu_and(printk_context, ~PRINTK_NMI_DIRECT_CONTEXT_MASK);
 }
 
-#else
-
-static __printf(1, 0) int vprintk_nmi(const char *fmt, va_list args)
-{
-	return 0;
-}
-
 #endif /* CONFIG_PRINTK_NMI */
 
-/*
- * Lock-less printk(), to avoid deadlocks should the printk() recurse
- * into itself. It uses a per-CPU buffer to store the message, just like
- * NMI.
- */
-static __printf(1, 0) int vprintk_safe(const char *fmt, va_list args)
-{
-	struct printk_safe_seq_buf *s = this_cpu_ptr(&safe_print_seq);
-
-	return printk_safe_log_store(s, fmt, args);
-}
-
 /* Can be preempted by NMI. */
 void __printk_safe_enter(void)
 {
@@ -369,7 +70,10 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args)
 	 * Use the main logbuf even in NMI. But avoid calling console
 	 * drivers that might have their own locks.
 	 */
-	if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK)) {
+	if (this_cpu_read(printk_context) &
+	    (PRINTK_NMI_DIRECT_CONTEXT_MASK |
+	     PRINTK_NMI_CONTEXT_MASK |
+	     PRINTK_SAFE_CONTEXT_MASK)) {
 		unsigned long flags;
 		int len;
 
@@ -380,34 +84,6 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args)
 		return len;
 	}
 
-	/* Use extra buffer in NMI. */
-	if (this_cpu_read(printk_context) & PRINTK_NMI_CONTEXT_MASK)
-		return vprintk_nmi(fmt, args);
-
-	/* Use extra buffer to prevent a recursion deadlock in safe mode. */
-	if (this_cpu_read(printk_context) & PRINTK_SAFE_CONTEXT_MASK)
-		return vprintk_safe(fmt, args);
-
 	/* No obstacles. */
 	return vprintk_default(fmt, args);
 }
-
-void __init printk_safe_init(void)
-{
-	int cpu;
-
-	for_each_possible_cpu(cpu) {
-		struct printk_safe_seq_buf *s;
-
-		s = &per_cpu(safe_print_seq, cpu);
-		init_irq_work(&s->work, __printk_safe_flush);
-
-#ifdef CONFIG_PRINTK_NMI
-		s = &per_cpu(nmi_print_seq, cpu);
-		init_irq_work(&s->work, __printk_safe_flush);
-#endif
-	}
-
-	/* Flush pending messages that did not have scheduled IRQ works. */
-	printk_safe_flush();
-}
diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c
index 8abe1870dba4..b09a490f5f70 100644
--- a/lib/nmi_backtrace.c
+++ b/lib/nmi_backtrace.c
@@ -75,12 +75,6 @@ void nmi_trigger_cpumask_backtrace(const cpumask_t *mask,
 		touch_softlockup_watchdog();
 	}
 
-	/*
-	 * Force flush any remote buffers that might be stuck in IRQ context
-	 * and therefore could not run their irq_work.
-	 */
-	printk_safe_flush();
-
 	clear_bit_unlock(0, &backtrace_flag);
 	put_cpu();
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH next v1 3/3] printk: convert @syslog_lock to spin_lock
  2021-03-16 23:33 [PATCH next v1 0/3] printk: remove safe buffers John Ogness
  2021-03-16 23:33 ` [PATCH next v1 1/3] printk: track/limit recursion John Ogness
  2021-03-16 23:33 ` [PATCH next v1 2/3] printk: remove safe buffers John Ogness
@ 2021-03-16 23:33 ` John Ogness
  2021-03-23 12:01   ` Petr Mladek
  2 siblings, 1 reply; 23+ messages in thread
From: John Ogness @ 2021-03-16 23:33 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, linux-kernel

@syslog_log was a raw_spin_lock to simplify the transition of
removing @logbuf_lock and the safe buffers. With that transition
complete, @syslog_log can become a spin_lock.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 kernel/printk/printk.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index fa52a5daa232..1e38174583c5 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -356,7 +356,7 @@ enum log_flags {
 };
 
 /* syslog_lock protects syslog_* variables and write access to clear_seq. */
-static DEFINE_RAW_SPINLOCK(syslog_lock);
+static DEFINE_SPINLOCK(syslog_lock);
 
 #ifdef CONFIG_PRINTK
 DECLARE_WAIT_QUEUE_HEAD(log_wait);
@@ -1478,9 +1478,9 @@ static int syslog_print(char __user *buf, int size)
 		size_t n;
 		size_t skip;
 
-		raw_spin_lock_irq(&syslog_lock);
+		spin_lock_irq(&syslog_lock);
 		if (!prb_read_valid(prb, syslog_seq, &r)) {
-			raw_spin_unlock_irq(&syslog_lock);
+			spin_unlock_irq(&syslog_lock);
 			break;
 		}
 		if (r.info->seq != syslog_seq) {
@@ -1509,7 +1509,7 @@ static int syslog_print(char __user *buf, int size)
 			syslog_partial += n;
 		} else
 			n = 0;
-		raw_spin_unlock_irq(&syslog_lock);
+		spin_unlock_irq(&syslog_lock);
 
 		if (!n)
 			break;
@@ -1573,9 +1573,9 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	}
 
 	if (clear) {
-		raw_spin_lock_irq(&syslog_lock);
+		spin_lock_irq(&syslog_lock);
 		latched_seq_write(&clear_seq, seq);
-		raw_spin_unlock_irq(&syslog_lock);
+		spin_unlock_irq(&syslog_lock);
 	}
 
 	kfree(text);
@@ -1584,9 +1584,9 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 
 static void syslog_clear(void)
 {
-	raw_spin_lock_irq(&syslog_lock);
+	spin_lock_irq(&syslog_lock);
 	latched_seq_write(&clear_seq, prb_next_seq(prb));
-	raw_spin_unlock_irq(&syslog_lock);
+	spin_unlock_irq(&syslog_lock);
 }
 
 /* Return a consistent copy of @syslog_seq. */
@@ -1594,9 +1594,9 @@ static u64 read_syslog_seq_irq(void)
 {
 	u64 seq;
 
-	raw_spin_lock_irq(&syslog_lock);
+	spin_lock_irq(&syslog_lock);
 	seq = syslog_seq;
-	raw_spin_unlock_irq(&syslog_lock);
+	spin_unlock_irq(&syslog_lock);
 
 	return seq;
 }
@@ -1674,10 +1674,10 @@ int do_syslog(int type, char __user *buf, int len, int source)
 		break;
 	/* Number of chars in the log buffer */
 	case SYSLOG_ACTION_SIZE_UNREAD:
-		raw_spin_lock_irq(&syslog_lock);
+		spin_lock_irq(&syslog_lock);
 		if (!prb_read_valid_info(prb, syslog_seq, &info, NULL)) {
 			/* No unread messages. */
-			raw_spin_unlock_irq(&syslog_lock);
+			spin_unlock_irq(&syslog_lock);
 			return 0;
 		}
 		if (info.seq != syslog_seq) {
@@ -1705,7 +1705,7 @@ int do_syslog(int type, char __user *buf, int len, int source)
 			}
 			error -= syslog_partial;
 		}
-		raw_spin_unlock_irq(&syslog_lock);
+		spin_unlock_irq(&syslog_lock);
 		break;
 	/* Size of the log buffer */
 	case SYSLOG_ACTION_SIZE_BUFFER:
@@ -3013,9 +3013,9 @@ void register_console(struct console *newcon)
 		exclusive_console_stop_seq = console_seq;
 
 		/* Get a consistent copy of @syslog_seq. */
-		raw_spin_lock_irqsave(&syslog_lock, flags);
+		spin_lock_irqsave(&syslog_lock, flags);
 		console_seq = syslog_seq;
-		raw_spin_unlock_irqrestore(&syslog_lock, flags);
+		spin_unlock_irqrestore(&syslog_lock, flags);
 	}
 	console_unlock();
 	console_sysfs_notify();
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 2/3] printk: remove safe buffers
  2021-03-16 23:33 ` [PATCH next v1 2/3] printk: remove safe buffers John Ogness
@ 2021-03-21  5:26   ` Sergey Senozhatsky
  2021-03-22 11:16     ` John Ogness
  2021-03-23 10:47   ` Petr Mladek
  1 sibling, 1 reply; 23+ messages in thread
From: Sergey Senozhatsky @ 2021-03-21  5:26 UTC (permalink / raw)
  To: John Ogness
  Cc: Petr Mladek, Steven Rostedt, Thomas Gleixner, linux-kernel,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Eric Biederman, Nicholas Piggin, Christophe Leroy,
	Alistair Popple, Jordan Niethe, Peter Zijlstra,
	Cédric Le Goater, Andrew Morton, Kees Cook, Yue Hu,
	Alexey Kardashevskiy, Rafael Aquini, Tiezhu Yang,
	Guilherme G. Piccoli, Paul E. McKenney, linuxppc-dev, kexec

On (21/03/17 00:33), John Ogness wrote:
[..]
>  void printk_nmi_direct_enter(void)
>  {
> @@ -324,27 +44,8 @@ void printk_nmi_direct_exit(void)
>  	this_cpu_and(printk_context, ~PRINTK_NMI_DIRECT_CONTEXT_MASK);
>  }
>  
> -#else
> -
> -static __printf(1, 0) int vprintk_nmi(const char *fmt, va_list args)
> -{
> -	return 0;
> -}
> -
>  #endif /* CONFIG_PRINTK_NMI */
>  
> -/*
> - * Lock-less printk(), to avoid deadlocks should the printk() recurse
> - * into itself. It uses a per-CPU buffer to store the message, just like
> - * NMI.
> - */
> -static __printf(1, 0) int vprintk_safe(const char *fmt, va_list args)
> -{
> -	struct printk_safe_seq_buf *s = this_cpu_ptr(&safe_print_seq);
> -
> -	return printk_safe_log_store(s, fmt, args);
> -}
> -
>  /* Can be preempted by NMI. */
>  void __printk_safe_enter(void)
>  {
> @@ -369,7 +70,10 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args)
>  	 * Use the main logbuf even in NMI. But avoid calling console
>  	 * drivers that might have their own locks.
>  	 */
> -	if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK)) {
> +	if (this_cpu_read(printk_context) &
> +	    (PRINTK_NMI_DIRECT_CONTEXT_MASK |
> +	     PRINTK_NMI_CONTEXT_MASK |
> +	     PRINTK_SAFE_CONTEXT_MASK)) {

Do we need printk_nmi_direct_enter/exit() and PRINTK_NMI_DIRECT_CONTEXT_MASK?
Seems like all printk_safe() paths are now DIRECT - we store messages to the
prb, but don't call console drivers.

	-ss

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 1/3] printk: track/limit recursion
  2021-03-16 23:33 ` [PATCH next v1 1/3] printk: track/limit recursion John Ogness
@ 2021-03-21  5:34   ` Sergey Senozhatsky
  2021-03-22 10:53     ` John Ogness
  2021-03-22 14:49   ` Petr Mladek
  1 sibling, 1 reply; 23+ messages in thread
From: Sergey Senozhatsky @ 2021-03-21  5:34 UTC (permalink / raw)
  To: John Ogness
  Cc: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky,
	Steven Rostedt, Thomas Gleixner, linux-kernel

On (21/03/17 00:33), John Ogness wrote:
[..]
>  static inline void printk_delay(void)
> @@ -2040,11 +2105,13 @@ int vprintk_store(int facility, int level,
>  	struct prb_reserved_entry e;
>  	enum log_flags lflags = 0;
>  	struct printk_record r;
> +	unsigned long irqflags;
>  	u16 trunc_msg_len = 0;
>  	char prefix_buf[8];
>  	u16 reserve_size;
>  	va_list args2;
>  	u16 text_len;
> +	int ret = 0;
>  	u64 ts_nsec;
>  
>  	/*
> @@ -2055,6 +2122,9 @@ int vprintk_store(int facility, int level,
>  	 */
>  	ts_nsec = local_clock();
>  
> +	if (!printk_enter_irqsave(&irqflags))
> +		return 0;

I guess it can be interesting to somehow signal us that we had printk()
recursion overflow, and how many messages we lost.

3 levels of recursion seem like reasonable limit, but I maybe wouldn't
mind one extra level. And maybe we could add some sort of message prefix
for high levels of recursion nesting (levels 3+), so that things should
not be normal will be on the radars and, possibly, will be reported.

	-ss

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 1/3] printk: track/limit recursion
  2021-03-21  5:34   ` Sergey Senozhatsky
@ 2021-03-22 10:53     ` John Ogness
  2021-03-22 11:13       ` Sergey Senozhatsky
  0 siblings, 1 reply; 23+ messages in thread
From: John Ogness @ 2021-03-22 10:53 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky,
	Steven Rostedt, Thomas Gleixner, linux-kernel

On 2021-03-21, Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
>> @@ -2055,6 +2122,9 @@ int vprintk_store(int facility, int level,
>>  	 */
>>  	ts_nsec = local_clock();
>>  
>> +	if (!printk_enter_irqsave(&irqflags))
>> +		return 0;
>
> I guess it can be interesting to somehow signal us that we had
> printk() recursion overflow, and how many messages we lost.

Honestly, if we hit 3 levels of recursion, we are probably dealing with
an infinite recursion issue. I do not see the value of counting the
overflows in that case. The logged messages at that recursion level
would ben enough to point us to the problem.

> 3 levels of recursion seem like reasonable limit, but I maybe wouldn't
> mind one extra level.

With 3 levels, we will see all the messages of:

    printk -> WARN_ON -> WARN_ON -> WARN_ON

Keep in mind that each additional level causes the reading of the logs
to be significantly more complex. Each level increases the output
exponentially:

   for every line1 in 1st_WARN_ON {
      for every line2 in 2nd_WARN_ON {
         for every line3 in 3rd_WARN_ON {
            print $line3
         }
         print $line2
      }
      print $line1
   }
   print $line0

IMHO 2 levels is enough because we should _never_ hit 2 levels of
recursion. If we do, the log output at that second level should be
enough to point to the bug. IMHO printing a third level just makes
things unnecessarily difficult to read. (My series uses 3 levels as a
compromise on my part. I would prefer reducing it to 2.)

> And maybe we could add some sort of message prefix for high levels of
> recursion nesting (levels 3+), so that things should not be normal
> will be on the radars and, possibly, will be reported.

I considered this, but am very hesitant to change the output
format. Also, the CUT_HERE usage (combined with PRINTK_CALLER) seem to
be enough.

John Ogness

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 1/3] printk: track/limit recursion
  2021-03-22 10:53     ` John Ogness
@ 2021-03-22 11:13       ` Sergey Senozhatsky
  2021-03-22 15:07         ` Petr Mladek
  0 siblings, 1 reply; 23+ messages in thread
From: Sergey Senozhatsky @ 2021-03-22 11:13 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Petr Mladek, Sergey Senozhatsky,
	Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel

On (21/03/22 11:53), John Ogness wrote:
> On 2021-03-21, Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
> >> @@ -2055,6 +2122,9 @@ int vprintk_store(int facility, int level,
> >>  	 */
> >>  	ts_nsec = local_clock();
> >>  
> >> +	if (!printk_enter_irqsave(&irqflags))
> >> +		return 0;
> >
> > I guess it can be interesting to somehow signal us that we had
> > printk() recursion overflow, and how many messages we lost.
> 
> Honestly, if we hit 3 levels of recursion, we are probably dealing with
> an infinite recursion issue.

I tend to agree.

> I do not see the value of counting the overflows in that case. The
> logged messages at that recursion level would ben enough to point
> us to the problem.
>
> > 3 levels of recursion seem like reasonable limit, but I maybe wouldn't
> > mind one extra level.
>
> With 3 levels, we will see all the messages of:
>
>     printk -> WARN_ON -> WARN_ON -> WARN_ON

Well, not necessarily this simple.

printk
 vsprintf
  handle_foo_specifier
   printk
    call_console_drivers
     timekeeping
      printk
       vsprintf

We saw in the past that enabling CONFIG_DEBUG_OBJECTS (if I'm not
mistaken) can add quite a bit of extra printk recursion paths.

We also have other CONFIG_DEBUG_* config options that can pop up as
recursive printk-s here and there. For instance, from vsprintf::foo_specifier()
where we escape from printk() to various kernel subsystems: net, block,
etc.

Maybe sometimes on level 3+ we'll see something interesting,
but I've no strong opinion on this.

> Keep in mind that each additional level causes the reading of the logs
> to be significantly more complex. Each level increases the output
> exponentially:

Yes, I realize that. That's why I suggested that maybe recursive
printk-s can have some special extra prefix. Recursive printk-s
will interleave with whatever is being printed on this_cpu, so
prefix might be helpful.

	-ss

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 2/3] printk: remove safe buffers
  2021-03-21  5:26   ` Sergey Senozhatsky
@ 2021-03-22 11:16     ` John Ogness
  2021-03-22 18:02       ` Petr Mladek
  0 siblings, 1 reply; 23+ messages in thread
From: John Ogness @ 2021-03-22 11:16 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Steven Rostedt, Thomas Gleixner, linux-kernel,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Eric Biederman, Nicholas Piggin, Christophe Leroy,
	Alistair Popple, Jordan Niethe, Peter Zijlstra,
	Cédric Le Goater, Andrew Morton, Kees Cook, Yue Hu,
	Alexey Kardashevskiy, Rafael Aquini, Tiezhu Yang,
	Guilherme G. Piccoli, Paul E. McKenney, linuxppc-dev, kexec

On 2021-03-21, Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
>> @@ -369,7 +70,10 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args)
>>  	 * Use the main logbuf even in NMI. But avoid calling console
>>  	 * drivers that might have their own locks.
>>  	 */
>> -	if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK)) {
>> +	if (this_cpu_read(printk_context) &
>> +	    (PRINTK_NMI_DIRECT_CONTEXT_MASK |
>> +	     PRINTK_NMI_CONTEXT_MASK |
>> +	     PRINTK_SAFE_CONTEXT_MASK)) {
>
> Do we need printk_nmi_direct_enter/exit() and
> PRINTK_NMI_DIRECT_CONTEXT_MASK?  Seems like all printk_safe() paths
> are now DIRECT - we store messages to the prb, but don't call console
> drivers.

I was planning on waiting until the kthreads are introduced, in which
case printk_safe.c is completely removed. But I suppose I could switch
the 1 printk_nmi_direct_enter() user to printk_nmi_enter() so that
PRINTK_NMI_DIRECT_CONTEXT_MASK can be removed now. I would do this in a
4th patch of the series.

John Ogness

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 1/3] printk: track/limit recursion
  2021-03-16 23:33 ` [PATCH next v1 1/3] printk: track/limit recursion John Ogness
  2021-03-21  5:34   ` Sergey Senozhatsky
@ 2021-03-22 14:49   ` Petr Mladek
  2021-03-23 21:32     ` John Ogness
  1 sibling, 1 reply; 23+ messages in thread
From: Petr Mladek @ 2021-03-22 14:49 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, linux-kernel

On Wed 2021-03-17 00:33:24, John Ogness wrote:
> Track printk() recursion and limit it to 3 levels per-CPU and per-context.

Please, explain why it is added. I mean that it will
allow remove printk_safe that provides recursion protection at the
moment.

> Signed-off-by: John Ogness <john.ogness@linutronix.de>
> ---
>  kernel/printk/printk.c | 80 ++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 77 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 2f829fbf0a13..c666e3e43f0c 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1940,6 +1940,71 @@ static void call_console_drivers(const char *ext_text, size_t ext_len,
>  	}
>  }
>  
> +/*
> + * Recursion is tracked separately on each CPU. If NMIs are supported, an
> + * additional NMI context per CPU is also separately tracked. Until per-CPU
> + * is available, a separate "early tracking" is performed.
> + */
> +#ifdef CONFIG_PRINTK_NMI

CONFIG_PRINTK_NMI is a shortcut for CONFIG_PRINTK && CONFIG_HAVE_NMI.
It should be possible to use CONFIG_HAVE_NMI here because this should
be in section where CONFIG_PRINTK is defined.

This would make sense if it allows to remove CONFIG_PRINTK_NMI
entirely. IMHO, it would be nice to remove one layer in the
config options of possible.


> +#define PRINTK_CTX_NUM 2
> +#else
> +#define PRINTK_CTX_NUM 1
> +#endif
> +static DEFINE_PER_CPU(char [PRINTK_CTX_NUM], printk_count);
> +static char printk_count_early[PRINTK_CTX_NUM];
> +
> +/*
> + * Recursion is limited to keep the output sane. printk() should not require
> + * more than 1 level of recursion (allowing, for example, printk() to trigger
> + * a WARN), but a higher value is used in case some printk-internal errors
> + * exist, such as the ringbuffer validation checks failing.
> + */
> +#define PRINTK_MAX_RECURSION 3
> +
> +/* Return a pointer to the dedicated counter for the CPU+context of the caller. */
> +static char *printk_recursion_counter(void)
> +{
> +	int ctx = 0;
> +
> +#ifdef CONFIG_PRINTK_NMI
> +	if (in_nmi())
> +		ctx = 1;
> +#endif
> +	if (!printk_percpu_data_ready())
> +		return &printk_count_early[ctx];
> +	return &((*this_cpu_ptr(&printk_count))[ctx]);
> +}

It is not a big deal. But using an array for two contexts looks strange
especially when only one is used on some architectures.
Also &((*this_cpu_ptr(&printk_count))[ctx]) is quite tricky ;-)

What do you think about the following, please?

static DEFINE_PER_CPU(u8 printk_count);
static u8 printk_count_early;

#ifdef CONFIG_HAVE_NMI
static DEFINE_PER_CPU(u8 printk_count_nmi);
static u8 printk_count_nmi_early;
#endif

static u8 *printk_recursion_counter(void)
{
	if (IS_ENABLED(CONFIG_HAVE_NMI) && in_nmi()) {
		if (printk_cpu_data_ready())
			return this_cpu_ptr(&printk_count_nmi);
		return printk_count_nmi_early;
	}

	if (printk_cpu_data_ready())
		return this_cpu_ptr(&printk_count);
	return printk_count_early;
}


Otherwise, it looks good to me. I like the simplicity.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 1/3] printk: track/limit recursion
  2021-03-22 11:13       ` Sergey Senozhatsky
@ 2021-03-22 15:07         ` Petr Mladek
  0 siblings, 0 replies; 23+ messages in thread
From: Petr Mladek @ 2021-03-22 15:07 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: John Ogness, Sergey Senozhatsky, Sergey Senozhatsky,
	Steven Rostedt, Thomas Gleixner, linux-kernel

On Mon 2021-03-22 20:13:51, Sergey Senozhatsky wrote:
> On (21/03/22 11:53), John Ogness wrote:
> > On 2021-03-21, Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
> > >> @@ -2055,6 +2122,9 @@ int vprintk_store(int facility, int level,
> > >>  	 */
> > >>  	ts_nsec = local_clock();
> > >>  
> > >> +	if (!printk_enter_irqsave(&irqflags))
> > >> +		return 0;
> > >
> > > I guess it can be interesting to somehow signal us that we had
> > > printk() recursion overflow, and how many messages we lost.
> > 
> > Honestly, if we hit 3 levels of recursion, we are probably dealing with
> > an infinite recursion issue.
> 
> I tend to agree.
> 
> > I do not see the value of counting the overflows in that case. The
> > logged messages at that recursion level would ben enough to point
> > us to the problem.
> >
> > > 3 levels of recursion seem like reasonable limit, but I maybe wouldn't
> > > mind one extra level.
> >
> > With 3 levels, we will see all the messages of:
> >
> >     printk -> WARN_ON -> WARN_ON -> WARN_ON

This is the worst case. Many messages are just a single line or
we use WARN_ON_ONCE.


> Well, not necessarily this simple.
> 
> printk
>  vsprintf
>   handle_foo_specifier
>    printk
>     call_console_drivers
>      timekeeping
>       printk
>        vsprintf
> 
> We saw in the past that enabling CONFIG_DEBUG_OBJECTS (if I'm not
> mistaken) can add quite a bit of extra printk recursion paths.
> 
> We also have other CONFIG_DEBUG_* config options that can pop up as
> recursive printk-s here and there. For instance, from vsprintf::foo_specifier()
> where we escape from printk() to various kernel subsystems: net, block,
> etc.
> 
> Maybe sometimes on level 3+ we'll see something interesting,
> but I've no strong opinion on this.

Honestly, my first reaction was the same as Sergey's. But John has
a point that too many levels might be hard to read.

I think that 3 levels are a good start. We could always change it when we
hit a problem in practice.

> > Keep in mind that each additional level causes the reading of the logs
> > to be significantly more complex. Each level increases the output
> > exponentially:
> 
> Yes, I realize that. That's why I suggested that maybe recursive
> printk-s can have some special extra prefix. Recursive printk-s
> will interleave with whatever is being printed on this_cpu, so
> prefix might be helpful.

This is an interesting area, definitely. I am not sure if it is worth
it though.

I would keep it simple and cut output on 3rd level for now. We could
always improve it when we hit a problem in the real life.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 2/3] printk: remove safe buffers
  2021-03-22 11:16     ` John Ogness
@ 2021-03-22 18:02       ` Petr Mladek
  2021-03-22 21:58         ` John Ogness
  0 siblings, 1 reply; 23+ messages in thread
From: Petr Mladek @ 2021-03-22 18:02 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, Eric Biederman, Nicholas Piggin,
	Christophe Leroy, Alistair Popple, Jordan Niethe, Peter Zijlstra,
	Cédric Le Goater, Andrew Morton, Kees Cook, Yue Hu,
	Alexey Kardashevskiy, Rafael Aquini, Tiezhu Yang,
	Guilherme G. Piccoli, Paul E. McKenney, linuxppc-dev, kexec

On Mon 2021-03-22 12:16:15, John Ogness wrote:
> On 2021-03-21, Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
> >> @@ -369,7 +70,10 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args)
> >>  	 * Use the main logbuf even in NMI. But avoid calling console
> >>  	 * drivers that might have their own locks.
> >>  	 */
> >> -	if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK)) {
> >> +	if (this_cpu_read(printk_context) &
> >> +	    (PRINTK_NMI_DIRECT_CONTEXT_MASK |
> >> +	     PRINTK_NMI_CONTEXT_MASK |
> >> +	     PRINTK_SAFE_CONTEXT_MASK)) {
> >
> > Do we need printk_nmi_direct_enter/exit() and
> > PRINTK_NMI_DIRECT_CONTEXT_MASK?  Seems like all printk_safe() paths
> > are now DIRECT - we store messages to the prb, but don't call console
> > drivers.
>
> I was planning on waiting until the kthreads are introduced, in which
> case printk_safe.c is completely removed.

You want to keep printk_safe() context because it prevents calling
consoles even in normal context. Namely, it prevents deadlock by
recursively taking, for example, sem->lock in console_lock() or
console_owner_lock in console_trylock_spinning(). Am I right?


> But I suppose I could switch
> the 1 printk_nmi_direct_enter() user to printk_nmi_enter() so that
> PRINTK_NMI_DIRECT_CONTEXT_MASK can be removed now. I would do this in a
> 4th patch of the series.

Yes, please unify the PRINTK_NMI_CONTEXT. One is enough.

I wonder if it would make sense to go even further at this stage.
There will still be 4 contexts that modify the printk behavior after
this patchset:

  + printk_count set by printk_enter()/exit()
      + prevents: infinite recursion
      + context: any context
      + action: skips entire printk at 3rd recursion level

  + prink_context set by printk_safe_enter()/exit()
      + prevents: dead lock caused by recursion into some
		console code in any context
      + context: any
      + action: skips console call at 1st recursion level

  + printk_context set by printk_nmi_enter()/exit()
      + prevents: dead lock caused by any console lock recursion
      + context: NMI
      + action: skips console calls at 0th recursion level

  + kdb_trap_printk
      + redirects printk() to kdb_printk() in kdb context


What is possible?

1. We could get rid of printk_nmi_enter()/exit() and
   PRINTK_NMI_CONTEXT completely already now. It is enough
   to check in_nmi() in printk_func().

   printk_nmi_enter() was added by the commit 42a0bb3f71383b457a7db362
   ("printk/nmi: generic solution for safe printk in NMI"). It was
   really needed to modify @printk_func pointer.

   We did not remove it later when printk_function became a real
   function. The idea was to track all printk contexts in a single
   variable. But we never added kdb context.

   It might make sense to remove it now. Peter Zijstra would be happy.
   There already were some churns with tracking printk_context in NMI.
   For example, see
   https://lore.kernel.org/r/20200219150744.428764577@infradead.org

   IMHO, it does not make sense to wait until the entire console-stuff
   rework is done in this case.


2. I thought about unifying printk_safe_enter()/exit() and
   printk_enter()/exit(). They both count recursion with
   IRQs disabled, have similar name. But they are used
   different way.

   But better might be to rename printk_safe_enter()/exit() to
   console_enter()/exit() or to printk_deferred_enter()/exit().
   It would make more clear what it does now. And it might help
   to better distinguish it from the new printk_enter()/exit().

   This patchset actually splits the original printk_safe()
   functionality into two:

       + printk_count prevents infinite recursion
       + printk_deferred_enter() deffers console handling.

   I am not sure if it is worth it. But it might help people (even me)
   when digging into the printk history. Different name will help to
   understand the functionality at the given time.


What do you think, please?

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 2/3] printk: remove safe buffers
  2021-03-22 18:02       ` Petr Mladek
@ 2021-03-22 21:58         ` John Ogness
  2021-03-23  9:46           ` Petr Mladek
  0 siblings, 1 reply; 23+ messages in thread
From: John Ogness @ 2021-03-22 21:58 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, Eric Biederman, Nicholas Piggin,
	Christophe Leroy, Alistair Popple, Jordan Niethe, Peter Zijlstra,
	Cédric Le Goater, Andrew Morton, Kees Cook, Yue Hu,
	Alexey Kardashevskiy, Rafael Aquini, Tiezhu Yang,
	Guilherme G. Piccoli, Paul E. McKenney, linuxppc-dev, kexec

On 2021-03-22, Petr Mladek <pmladek@suse.com> wrote:
> On Mon 2021-03-22 12:16:15, John Ogness wrote:
>> On 2021-03-21, Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
>> >> @@ -369,7 +70,10 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args)
>> >>  	 * Use the main logbuf even in NMI. But avoid calling console
>> >>  	 * drivers that might have their own locks.
>> >>  	 */
>> >> -	if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK)) {
>> >> +	if (this_cpu_read(printk_context) &
>> >> +	    (PRINTK_NMI_DIRECT_CONTEXT_MASK |
>> >> +	     PRINTK_NMI_CONTEXT_MASK |
>> >> +	     PRINTK_SAFE_CONTEXT_MASK)) {
>> >
>> > Do we need printk_nmi_direct_enter/exit() and
>> > PRINTK_NMI_DIRECT_CONTEXT_MASK?  Seems like all printk_safe() paths
>> > are now DIRECT - we store messages to the prb, but don't call console
>> > drivers.
>>
>> I was planning on waiting until the kthreads are introduced, in which
>> case printk_safe.c is completely removed.
>
> You want to keep printk_safe() context because it prevents calling
> consoles even in normal context. Namely, it prevents deadlock by
> recursively taking, for example, sem->lock in console_lock() or
> console_owner_lock in console_trylock_spinning(). Am I right?

Correct.

>> But I suppose I could switch
>> the 1 printk_nmi_direct_enter() user to printk_nmi_enter() so that
>> PRINTK_NMI_DIRECT_CONTEXT_MASK can be removed now. I would do this in a
>> 4th patch of the series.
>
> Yes, please unify the PRINTK_NMI_CONTEXT. One is enough.

Agreed. (But I'll go even further. See below.)

> I wonder if it would make sense to go even further at this stage.
> There will still be 4 contexts that modify the printk behavior after
> this patchset:
>
>   + printk_count set by printk_enter()/exit()
>       + prevents: infinite recursion
>       + context: any context
>       + action: skips entire printk at 3rd recursion level
>
>   + prink_context set by printk_safe_enter()/exit()
>       + prevents: dead lock caused by recursion into some
> 		console code in any context
>       + context: any
>       + action: skips console call at 1st recursion level

Technically, at this point printk_safe_enter() behavior is identical to
printk_nmi_enter(). Namely, prevent any recursive printk calls from
calling into the console code.

>   + printk_context set by printk_nmi_enter()/exit()
>       + prevents: dead lock caused by any console lock recursion
>       + context: NMI
>       + action: skips console calls at 0th recursion level
>
>   + kdb_trap_printk
>       + redirects printk() to kdb_printk() in kdb context
>
>
> What is possible?
>
> 1. We could get rid of printk_nmi_enter()/exit() and
>    PRINTK_NMI_CONTEXT completely already now. It is enough
>    to check in_nmi() in printk_func().
>
>    printk_nmi_enter() was added by the commit 42a0bb3f71383b457a7db362
>    ("printk/nmi: generic solution for safe printk in NMI"). It was
>    really needed to modify @printk_func pointer.
>
>    We did not remove it later when printk_function became a real
>    function. The idea was to track all printk contexts in a single
>    variable. But we never added kdb context.
>
>    It might make sense to remove it now. Peter Zijstra would be happy.
>    There already were some churns with tracking printk_context in NMI.
>    For example, see
>    https://lore.kernel.org/r/20200219150744.428764577@infradead.org
>
>    IMHO, it does not make sense to wait until the entire console-stuff
>    rework is done in this case.

Agreed. in_nmi() within vprintk_emit() is enough to detect if the
console code should be skipped:

    if (!in_sched && !in_nmi()) {
        ...
    }

> 2. I thought about unifying printk_safe_enter()/exit() and
>    printk_enter()/exit(). They both count recursion with
>    IRQs disabled, have similar name. But they are used
>    different way.
>
>    But better might be to rename printk_safe_enter()/exit() to
>    console_enter()/exit() or to printk_deferred_enter()/exit().
>    It would make more clear what it does now. And it might help
>    to better distinguish it from the new printk_enter()/exit().
>
>    This patchset actually splits the original printk_safe()
>    functionality into two:
>
>        + printk_count prevents infinite recursion
>        + printk_deferred_enter() deffers console handling.
>
>    I am not sure if it is worth it. But it might help people (even me)
>    when digging into the printk history. Different name will help to
>    understand the functionality at the given time.

I am also not sure if it is worth the extra "noise" just to give the
function a more appropriate name. The plan is to remove it completely
soon anyway. My vote is to leave the name as it is.

But I am willing to do the rename in an addtional patch if you
want. printk_deferred_enter() sounds fine to me. Please confirm if you
want me to do this.

John Ogness

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 2/3] printk: remove safe buffers
  2021-03-22 21:58         ` John Ogness
@ 2021-03-23  9:46           ` Petr Mladek
  0 siblings, 0 replies; 23+ messages in thread
From: Petr Mladek @ 2021-03-23  9:46 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, Eric Biederman, Nicholas Piggin,
	Christophe Leroy, Alistair Popple, Jordan Niethe, Peter Zijlstra,
	Cédric Le Goater, Andrew Morton, Kees Cook, Yue Hu,
	Alexey Kardashevskiy, Rafael Aquini, Tiezhu Yang,
	Guilherme G. Piccoli, Paul E. McKenney, linuxppc-dev, kexec

On Mon 2021-03-22 22:58:47, John Ogness wrote:
> On 2021-03-22, Petr Mladek <pmladek@suse.com> wrote:
> > On Mon 2021-03-22 12:16:15, John Ogness wrote:
> >> On 2021-03-21, Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
> >> >> @@ -369,7 +70,10 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args)
> >> >>  	 * Use the main logbuf even in NMI. But avoid calling console
> >> >>  	 * drivers that might have their own locks.
> >> >>  	 */
> >> >> -	if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK)) {
> >> >> +	if (this_cpu_read(printk_context) &
> >> >> +	    (PRINTK_NMI_DIRECT_CONTEXT_MASK |
> >> >> +	     PRINTK_NMI_CONTEXT_MASK |
> >> >> +	     PRINTK_SAFE_CONTEXT_MASK)) {
> >> >

> >> But I suppose I could switch
> >> the 1 printk_nmi_direct_enter() user to printk_nmi_enter() so that
> >> PRINTK_NMI_DIRECT_CONTEXT_MASK can be removed now. I would do this in a
> >> 4th patch of the series.
> >
> > Yes, please unify the PRINTK_NMI_CONTEXT. One is enough.
> 
> Agreed. (But I'll go even further. See below.)
> 
> > I wonder if it would make sense to go even further at this stage.
> > What is possible?
> >
> > 1. We could get rid of printk_nmi_enter()/exit() and
> >    PRINTK_NMI_CONTEXT completely already now. It is enough
> >    to check in_nmi() in printk_func().
> >
> 
> Agreed. in_nmi() within vprintk_emit() is enough to detect if the
> console code should be skipped:
> 
>     if (!in_sched && !in_nmi()) {
>         ...
>     }

Well, we also need to make sure that the irq work is scheduled to
call console later. We should keep this dicision in
printk_func(). I mean to replace the current

	if (this_cpu_read(printk_context) &
	    (PRINTK_NMI_DIRECT_CONTEXT_MASK |
	     PRINTK_NMI_CONTEXT_MASK |
	     PRINTK_SAFE_CONTEXT_MASK)) {

with

	/*
	 * Avoid calling console drivers in recursive printk()
	 * and in NMI context.
	 */
	if (this_cpu_read(printk_context) || in_nmi() {

That said, I am not sure how this fits your further rework.
I do not want to complicate it too much.

I am just afraid that the discussion about console rework might
take some time. And this would remove some complexity before we
started the more complicated or controversial changes.


> > 2. I thought about unifying printk_safe_enter()/exit() and
> >    printk_enter()/exit(). They both count recursion with
> >    IRQs disabled, have similar name. But they are used
> >    different way.
> >
> >    But better might be to rename printk_safe_enter()/exit() to
> >    console_enter()/exit() or to printk_deferred_enter()/exit().
> >    It would make more clear what it does now. And it might help
> >    to better distinguish it from the new printk_enter()/exit().
> >
> >    I am not sure if it is worth it.
> 
> I am also not sure if it is worth the extra "noise" just to give the
> function a more appropriate name. The plan is to remove it completely
> soon anyway. My vote is to leave the name as it is.

OK, let's keep printk_safe() name. It was just an idea. I wrote it
primary to sort my thoughts.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 2/3] printk: remove safe buffers
  2021-03-16 23:33 ` [PATCH next v1 2/3] printk: remove safe buffers John Ogness
  2021-03-21  5:26   ` Sergey Senozhatsky
@ 2021-03-23 10:47   ` Petr Mladek
  2021-03-26 11:12     ` John Ogness
  1 sibling, 1 reply; 23+ messages in thread
From: Petr Mladek @ 2021-03-23 10:47 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, linux-kernel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Eric Biederman,
	Nicholas Piggin, Christophe Leroy, Alistair Popple,
	Jordan Niethe, Peter Zijlstra, Cédric Le Goater,
	Andrew Morton, Kees Cook, Yue Hu, Alexey Kardashevskiy,
	Rafael Aquini, Tiezhu Yang, Guilherme G. Piccoli,
	Paul E. McKenney, linuxppc-dev, kexec

On Wed 2021-03-17 00:33:25, John Ogness wrote:
> With @logbuf_lock removed, the high level printk functions for
> storing messages are lockless. Messages can be stored from any
> context, so there is no need for the NMI and safe buffers anymore.
> Remove the NMI and safe buffers.
> 
> Although the safe buffers are removed, the NMI and safe context
> tracking is still in place. In these contexts, store the message
> immediately but still use irq_work to defer the console printing.
> 
> Since printk recursion tracking is in place, safe context tracking
> for most of printk is not needed. Remove it. Only safe context
> tracking relating to the console lock is left in place. This is
> because the console lock is needed for the actual printing.

I have two more questions after actually checking the entire patch.
See below.

> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1084,7 +1069,6 @@ void __init setup_log_buf(int early)
>  	struct printk_record r;
>  	size_t new_descs_size;
>  	size_t new_infos_size;
> -	unsigned long flags;
>  	char *new_log_buf;
>  	unsigned int free;
>  	u64 seq;
> @@ -1142,8 +1126,6 @@ void __init setup_log_buf(int early)
>  		 new_descs, ilog2(new_descs_count),
>  		 new_infos);
>  
> -	printk_safe_enter_irqsave(flags);
> -
>  	log_buf_len = new_log_buf_len;
>  	log_buf = new_log_buf;
>  	new_log_buf_len = 0;
> @@ -1159,8 +1141,6 @@ void __init setup_log_buf(int early)
>  	 */
>  	prb = &printk_rb_dynamic;
>  
> -	printk_safe_exit_irqrestore(flags);

This will allow to add new messages from the IRQ context when we
are copying them to the new buffer. They might get lost in
the small race window.

Also the messages from NMI might get lost because they are not
longer stored in the per-CPU buffer.

A possible solution might be to do something like this:

	prb_for_each_record(0, &printk_rb_static, seq, &r)
		free -= add_to_rb(&printk_rb_dynamic, &r);

	prb = &printk_rb_dynamic;

	/*
	 * Copy the remaining messages that might have appeared
	 * from IRQ or NMI context after we ended copying and
	 * before we switched the buffers. They must be finalized
	 * because only one CPU is up at this stage.
	 */
	prb_for_each_record(seq, &printk_rb_static, seq, &r)
		free -= add_to_rb(&printk_rb_dynamic, &r);


> -
>  	if (seq != prb_next_seq(&printk_rb_static)) {
>  		pr_err("dropped %llu messages\n",
>  		       prb_next_seq(&printk_rb_static) - seq);
> @@ -2666,7 +2631,6 @@ void console_unlock(void)
>  		size_t ext_len = 0;
>  		size_t len;
>  
> -		printk_safe_enter_irqsave(flags);
>  skip:
>  		if (!prb_read_valid(prb, console_seq, &r))
>  			break;
> @@ -2711,6 +2675,8 @@ void console_unlock(void)
>  				printk_time);
>  		console_seq++;
>  
> +		printk_safe_enter_irqsave(flags);

What is the purpose of the printk_safe context here, please?

I guess that you wanted to prevent calling console drivers
recursively. But it is already serialized by console_lock().

IMHO, the only risk is when manipulating console_sem->lock
or console_owner_lock. But they are already guarded by
printk_safe context, for example, in console_lock() or
console_lock_spinning_enable().

Do I miss something, please?


> +
>  		/*
>  		 * While actively printing out messages, if another printk()
>  		 * were to occur on another CPU, it may wait for this one to
> @@ -2745,8 +2711,6 @@ void console_unlock(void)
>  	 * flush, no worries.
>  	 */
>  	retry = prb_read_valid(prb, console_seq, NULL);
> -	printk_safe_exit_irqrestore(flags);
> -
>  	if (retry && console_trylock())
>  		goto again;
>  }

Heh, all these patches feels like stripping printk of an armour. I hope
that we trained it enough to be flexible and avoid any damage.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 3/3] printk: convert @syslog_lock to spin_lock
  2021-03-16 23:33 ` [PATCH next v1 3/3] printk: convert @syslog_lock to spin_lock John Ogness
@ 2021-03-23 12:01   ` Petr Mladek
  2021-03-26 11:23     ` John Ogness
  0 siblings, 1 reply; 23+ messages in thread
From: Petr Mladek @ 2021-03-23 12:01 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, linux-kernel

On Wed 2021-03-17 00:33:26, John Ogness wrote:
> @syslog_log was a raw_spin_lock to simplify the transition of

s/syslog_log/syslog_lock/

Same problem is also below.

> removing @logbuf_lock and the safe buffers. With that transition
> complete, @syslog_log can become a spin_lock.

I know that we already talked about this. But I want to be sure
that this patch makes sense.

It will actually not change the behavior because we always
take the lock with interrupts disabled.

We disable the interrupts because register_console() is called
in IRQ context on parisc in handle_interruption() when it is
going to panic (code == 1 => will call parisc_terminate()).

Disabling IRQ will not help in parisc_terminate(). This code
path is non-maskable and never returns. The deadlock might be
prevented only by trylock.

trylock on syslog_lock is only small problem. Much bigger is
is a deadlock on console_lock. Fixing this is beyond this
patchset.

Summary:

    + disabling IRQ does not help for parisc

    + register_console() is not irq safe in general because
      of the sleeping console_lock.


I would personally prefer to remove both "raw" and "irq"
in this patch and just document the problem with parisc
in the commit message.

IMHO, it does not make sense to keep _irq when it neither
helps nor makes sense.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 1/3] printk: track/limit recursion
  2021-03-22 14:49   ` Petr Mladek
@ 2021-03-23 21:32     ` John Ogness
  2021-03-24  8:41       ` Petr Mladek
  0 siblings, 1 reply; 23+ messages in thread
From: John Ogness @ 2021-03-23 21:32 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, linux-kernel

On 2021-03-22, Petr Mladek <pmladek@suse.com> wrote:
> On Wed 2021-03-17 00:33:24, John Ogness wrote:
>> Track printk() recursion and limit it to 3 levels per-CPU and per-context.
>
> Please, explain why it is added. I mean that it will
> allow remove printk_safe that provides recursion protection at the
> moment.

OK.

>> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> index 2f829fbf0a13..c666e3e43f0c 100644
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -1940,6 +1940,71 @@ static void call_console_drivers(const char *ext_text, size_t ext_len,
>>  	}
>>  }
>>  
>> +/*
>> + * Recursion is tracked separately on each CPU. If NMIs are supported, an
>> + * additional NMI context per CPU is also separately tracked. Until per-CPU
>> + * is available, a separate "early tracking" is performed.
>> + */
>> +#ifdef CONFIG_PRINTK_NMI
>
> CONFIG_PRINTK_NMI is a shortcut for CONFIG_PRINTK && CONFIG_HAVE_NMI.
> It should be possible to use CONFIG_HAVE_NMI here because this should
> be in section where CONFIG_PRINTK is defined.
>
> This would make sense if it allows to remove CONFIG_PRINTK_NMI
> entirely. IMHO, it would be nice to remove one layer in the
> config options of possible.

OK. I will remove CONFIG_PRINTK_NMI for v2.

>> +#define PRINTK_CTX_NUM 2
>> +#else
>> +#define PRINTK_CTX_NUM 1
>> +#endif
>> +static DEFINE_PER_CPU(char [PRINTK_CTX_NUM], printk_count);
>> +static char printk_count_early[PRINTK_CTX_NUM];
>> +
>> +/*
>> + * Recursion is limited to keep the output sane. printk() should not require
>> + * more than 1 level of recursion (allowing, for example, printk() to trigger
>> + * a WARN), but a higher value is used in case some printk-internal errors
>> + * exist, such as the ringbuffer validation checks failing.
>> + */
>> +#define PRINTK_MAX_RECURSION 3
>> +
>> +/* Return a pointer to the dedicated counter for the CPU+context of the caller. */
>> +static char *printk_recursion_counter(void)
>> +{
>> +	int ctx = 0;
>> +
>> +#ifdef CONFIG_PRINTK_NMI
>> +	if (in_nmi())
>> +		ctx = 1;
>> +#endif
>> +	if (!printk_percpu_data_ready())
>> +		return &printk_count_early[ctx];
>> +	return &((*this_cpu_ptr(&printk_count))[ctx]);
>> +}
>
> It is not a big deal. But using an array for two contexts looks strange
> especially when only one is used on some architectures.
> Also &((*this_cpu_ptr(&printk_count))[ctx]) is quite tricky ;-)
>
> What do you think about the following, please?
>
> static DEFINE_PER_CPU(u8 printk_count);
> static u8 printk_count_early;
>
> #ifdef CONFIG_HAVE_NMI
> static DEFINE_PER_CPU(u8 printk_count_nmi);
> static u8 printk_count_nmi_early;
> #endif
>
> static u8 *printk_recursion_counter(void)
> {
> 	if (IS_ENABLED(CONFIG_HAVE_NMI) && in_nmi()) {
> 		if (printk_cpu_data_ready())
> 			return this_cpu_ptr(&printk_count_nmi);
> 		return printk_count_nmi_early;
> 	}
>
> 	if (printk_cpu_data_ready())
> 		return this_cpu_ptr(&printk_count);
> 	return printk_count_early;
> }

I can split it into explicit variables. But is the use of the IS_ENABLED
macro preferred over ifdef? I would prefer:

static u8 *printk_recursion_counter(void)
{
#ifdef CONFIG_HAVE_NMI
	if (in_nmi()) {
		if (printk_cpu_data_ready())
			return this_cpu_ptr(&printk_count_nmi);
		return printk_count_nmi_early;
	}
#endif
	if (printk_cpu_data_ready())
		return this_cpu_ptr(&printk_count);
	return printk_count_early;
}

Since @printk_count_nmi and @printk_count_nmi_early would not exist, I
would prefer the pre-processor removes that code block rather than
relying on compiler optimization.

John Ogness

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 1/3] printk: track/limit recursion
  2021-03-23 21:32     ` John Ogness
@ 2021-03-24  8:41       ` Petr Mladek
  0 siblings, 0 replies; 23+ messages in thread
From: Petr Mladek @ 2021-03-24  8:41 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, linux-kernel

On Tue 2021-03-23 22:32:00, John Ogness wrote:
> On 2021-03-22, Petr Mladek <pmladek@suse.com> wrote:
> > On Wed 2021-03-17 00:33:24, John Ogness wrote:
> >> Track printk() recursion and limit it to 3 levels per-CPU and per-context.
> >
> >> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> >> index 2f829fbf0a13..c666e3e43f0c 100644
> >> --- a/kernel/printk/printk.c
> >> +++ b/kernel/printk/printk.c
> >> +/* Return a pointer to the dedicated counter for the CPU+context of the caller. */
> >> +static char *printk_recursion_counter(void)
> >> +{
> >> +	int ctx = 0;
> >> +
> >> +#ifdef CONFIG_PRINTK_NMI
> >> +	if (in_nmi())
> >> +		ctx = 1;
> >> +#endif
> >> +	if (!printk_percpu_data_ready())
> >> +		return &printk_count_early[ctx];
> >> +	return &((*this_cpu_ptr(&printk_count))[ctx]);
> >> +}
> >
> > It is not a big deal. But using an array for two contexts looks strange
> > especially when only one is used on some architectures.
> > Also &((*this_cpu_ptr(&printk_count))[ctx]) is quite tricky ;-)
> >
> > What do you think about the following, please?
> >
> > static DEFINE_PER_CPU(u8 printk_count);
> > static u8 printk_count_early;
> >
> > #ifdef CONFIG_HAVE_NMI
> > static DEFINE_PER_CPU(u8 printk_count_nmi);
> > static u8 printk_count_nmi_early;
> > #endif
> >
> > static u8 *printk_recursion_counter(void)
> > {
> > 	if (IS_ENABLED(CONFIG_HAVE_NMI) && in_nmi()) {
> > 		if (printk_cpu_data_ready())
> > 			return this_cpu_ptr(&printk_count_nmi);
> > 		return printk_count_nmi_early;
> > 	}
> >
> > 	if (printk_cpu_data_ready())
> > 		return this_cpu_ptr(&printk_count);
> > 	return printk_count_early;
> > }
> 
> I can split it into explicit variables. But is the use of the IS_ENABLED
> macro preferred over ifdef? I would prefer:
> 
> static u8 *printk_recursion_counter(void)
> {
> #ifdef CONFIG_HAVE_NMI
> 	if (in_nmi()) {
> 		if (printk_cpu_data_ready())
> 			return this_cpu_ptr(&printk_count_nmi);
> 		return printk_count_nmi_early;
> 	}
> #endif
> 	if (printk_cpu_data_ready())
> 		return this_cpu_ptr(&printk_count);
> 	return printk_count_early;
> }
> 
> Since @printk_count_nmi and @printk_count_nmi_early would not exist, I
> would prefer the pre-processor removes that code block rather than
> relying on compiler optimization.

Feel free to use #ifdef.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 2/3] printk: remove safe buffers
  2021-03-23 10:47   ` Petr Mladek
@ 2021-03-26 11:12     ` John Ogness
  2021-03-29 10:04       ` Petr Mladek
  0 siblings, 1 reply; 23+ messages in thread
From: John Ogness @ 2021-03-26 11:12 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, linux-kernel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Eric Biederman,
	Nicholas Piggin, Christophe Leroy, Alistair Popple,
	Jordan Niethe, Peter Zijlstra, Cédric Le Goater,
	Andrew Morton, Kees Cook, Yue Hu, Alexey Kardashevskiy,
	Rafael Aquini, Tiezhu Yang, Guilherme G. Piccoli,
	Paul E. McKenney, linuxppc-dev, kexec

On 2021-03-23, Petr Mladek <pmladek@suse.com> wrote:
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -1142,8 +1126,6 @@ void __init setup_log_buf(int early)
>>  		 new_descs, ilog2(new_descs_count),
>>  		 new_infos);
>>  
>> -	printk_safe_enter_irqsave(flags);
>> -
>>  	log_buf_len = new_log_buf_len;
>>  	log_buf = new_log_buf;
>>  	new_log_buf_len = 0;
>> @@ -1159,8 +1141,6 @@ void __init setup_log_buf(int early)
>>  	 */
>>  	prb = &printk_rb_dynamic;
>>  
>> -	printk_safe_exit_irqrestore(flags);
>
> This will allow to add new messages from the IRQ context when we
> are copying them to the new buffer. They might get lost in
> the small race window.
>
> Also the messages from NMI might get lost because they are not
> longer stored in the per-CPU buffer.
>
> A possible solution might be to do something like this:
>
> 	prb_for_each_record(0, &printk_rb_static, seq, &r)
> 		free -= add_to_rb(&printk_rb_dynamic, &r);
>
> 	prb = &printk_rb_dynamic;
>
> 	/*
> 	 * Copy the remaining messages that might have appeared
> 	 * from IRQ or NMI context after we ended copying and
> 	 * before we switched the buffers. They must be finalized
> 	 * because only one CPU is up at this stage.
> 	 */
> 	prb_for_each_record(seq, &printk_rb_static, seq, &r)
> 		free -= add_to_rb(&printk_rb_dynamic, &r);

OK. I'll probably rework it some and combine it with the "dropped" test
so that we can identify if messages were dropped during the transition
(because of static ringbuffer overrun).

>> -
>>  	if (seq != prb_next_seq(&printk_rb_static)) {
>>  		pr_err("dropped %llu messages\n",
>>  		       prb_next_seq(&printk_rb_static) - seq);
>> @@ -2666,7 +2631,6 @@ void console_unlock(void)
>>  		size_t ext_len = 0;
>>  		size_t len;
>>  
>> -		printk_safe_enter_irqsave(flags);
>>  skip:
>>  		if (!prb_read_valid(prb, console_seq, &r))
>>  			break;
>> @@ -2711,6 +2675,8 @@ void console_unlock(void)
>>  				printk_time);
>>  		console_seq++;
>>  
>> +		printk_safe_enter_irqsave(flags);
>
> What is the purpose of the printk_safe context here, please?

console_lock_spinning_enable() needs to be called with interrupts
disabled. I should have just used local_irq_save().

I could add local_irq_save() to console_lock_spinning_enable() and
restore them at the end of console_lock_spinning_disable_and_check(),
but then I would need to add a @flags argument to both functions. I
think it is simpler to just do the disable/enable from the caller,
console_unlock().

BTW, I could not find any sane way of disabling interrupts via a
raw_spin_lock_irqsave() of @console_owner_lock because of the how it is
used with lockdep. In particular for
console_lock_spinning_disable_and_check().

John Ogness

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 3/3] printk: convert @syslog_lock to spin_lock
  2021-03-23 12:01   ` Petr Mladek
@ 2021-03-26 11:23     ` John Ogness
  0 siblings, 0 replies; 23+ messages in thread
From: John Ogness @ 2021-03-26 11:23 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, linux-kernel

On 2021-03-23, Petr Mladek <pmladek@suse.com> wrote:

> On Wed 2021-03-17 00:33:26, John Ogness wrote:
>> @syslog_log was a raw_spin_lock to simplify the transition of
>
> s/syslog_log/syslog_lock/
>
> Same problem is also below.

Right.

>> removing @logbuf_lock and the safe buffers. With that transition
>> complete, @syslog_log can become a spin_lock.
>
> I know that we already talked about this. But I want to be sure
> that this patch makes sense.
>
> It will actually not change the behavior because we always
> take the lock with interrupts disabled.
>
> We disable the interrupts because register_console() is called
> in IRQ context on parisc in handle_interruption() when it is
> going to panic (code == 1 => will call parisc_terminate()).

Yes. [0]

> Disabling IRQ will not help in parisc_terminate(). This code
> path is non-maskable and never returns. The deadlock might be
> prevented only by trylock.
>
> trylock on syslog_lock is only small problem. Much bigger is
> is a deadlock on console_lock. Fixing this is beyond this
> patchset.
>
> Summary:
>
>     + disabling IRQ does not help for parisc
>
>     + register_console() is not irq safe in general because
>       of the sleeping console_lock.
>
>
> I would personally prefer to remove both "raw" and "irq"
> in this patch and just document the problem with parisc
> in the commit message.
>
> IMHO, it does not make sense to keep _irq when it neither
> helps nor makes sense.

I agree. I will change it for v2 and note in the commit message that the
parisc call chain:

handle_interruption(code=1) /* High-priority machine check (HPMC) */
  pdc_console_restart()
    pdc_console_init_force()
      register_console()

is unsafe and is the only register_console() user in atomic context.

John Ogness

[0] https://lore.kernel.org/lkml/8735xs10hi.fsf@jogness.linutronix.de

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 2/3] printk: remove safe buffers
  2021-03-26 11:12     ` John Ogness
@ 2021-03-29 10:04       ` Petr Mladek
  2021-03-29 15:10         ` John Ogness
  0 siblings, 1 reply; 23+ messages in thread
From: Petr Mladek @ 2021-03-29 10:04 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, linux-kernel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Eric Biederman,
	Nicholas Piggin, Christophe Leroy, Alistair Popple,
	Jordan Niethe, Peter Zijlstra, Cédric Le Goater,
	Andrew Morton, Kees Cook, Yue Hu, Alexey Kardashevskiy,
	Rafael Aquini, Tiezhu Yang, Guilherme G. Piccoli,
	Paul E. McKenney, linuxppc-dev, kexec

On Fri 2021-03-26 12:12:37, John Ogness wrote:
> On 2021-03-23, Petr Mladek <pmladek@suse.com> wrote:
> >> --- a/kernel/printk/printk.c
> >> +++ b/kernel/printk/printk.c
> >> -
> >>  	if (seq != prb_next_seq(&printk_rb_static)) {
> >>  		pr_err("dropped %llu messages\n",
> >>  		       prb_next_seq(&printk_rb_static) - seq);
> >> @@ -2666,7 +2631,6 @@ void console_unlock(void)
> >>  		size_t ext_len = 0;
> >>  		size_t len;
> >>  
> >> -		printk_safe_enter_irqsave(flags);
> >>  skip:
> >>  		if (!prb_read_valid(prb, console_seq, &r))
> >>  			break;
> >> @@ -2711,6 +2675,8 @@ void console_unlock(void)
> >>  				printk_time);
> >>  		console_seq++;
> >>  
> >> +		printk_safe_enter_irqsave(flags);
> >
> > What is the purpose of the printk_safe context here, please?
> 
> console_lock_spinning_enable() needs to be called with interrupts
> disabled. I should have just used local_irq_save().
> 
> I could add local_irq_save() to console_lock_spinning_enable() and
> restore them at the end of console_lock_spinning_disable_and_check(),
> but then I would need to add a @flags argument to both functions. I
> think it is simpler to just do the disable/enable from the caller,
> console_unlock().

I see. I have missed it that all this code have to be called with
interrupts disabled.

OK, it is a must-to-have because of the spinning. But I wonder if some
console drivers rely on the fact that the write() callback is
called with interrupts disabled.

IMHO, it would be a bug when any write() callback expects that
callers disabled the interrupts.

Do you plan to remove the console-spinning stuff after offloading
consoles to the kthreads?

Will you call console write() callback with irq enabled from
the kthread?

Anyway, we should at least add a comment why the interrupts are
disabled.


> BTW, I could not find any sane way of disabling interrupts via a
> raw_spin_lock_irqsave() of @console_owner_lock because of the how it is
> used with lockdep. In particular for
> console_lock_spinning_disable_and_check().

I see. IMHO, we would need to explicitly call local_irq_save()/restore()
if we moved them to console_lock_spinning_enable()/disable_and_check().
I mean to do:


static void console_lock_spinning_enable(unsigned long *flags)
{
	local_irq_save(*flags);

	raw_spin_lock(&console_owner_lock);
	console_owner = current;
	raw_spin_unlock(&console_owner_lock);

	/* The waiter may spin on us after setting console_owner */
	spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
}

...

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 2/3] printk: remove safe buffers
  2021-03-29 10:04       ` Petr Mladek
@ 2021-03-29 15:10         ` John Ogness
  2021-03-29 15:13           ` John Ogness
  0 siblings, 1 reply; 23+ messages in thread
From: John Ogness @ 2021-03-29 15:10 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, linux-kernel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Eric Biederman,
	Nicholas Piggin, Christophe Leroy, Alistair Popple,
	Jordan Niethe, Peter Zijlstra, Cédric Le Goater,
	Andrew Morton, Kees Cook, Yue Hu, Alexey Kardashevskiy,
	Rafael Aquini, Tiezhu Yang, Guilherme G. Piccoli,
	Paul E. McKenney, linuxppc-dev, kexec

On 2021-03-29, Petr Mladek <pmladek@suse.com> wrote:
> I wonder if some console drivers rely on the fact that the write()
> callback is called with interrupts disabled.
>
> IMHO, it would be a bug when any write() callback expects that
> callers disabled the interrupts.

Agreed.

> Do you plan to remove the console-spinning stuff after offloading
> consoles to the kthreads?

Yes. Although a similar concept will be introduced to allow the threaded
printers and the atomic consoles to compete.

> Will you call console write() callback with irq enabled from the
> kthread?

No. That defeats the fundamental purpose of this entire rework
excercise. ;-)

> Anyway, we should at least add a comment why the interrupts are
> disabled.

I decided to move the local_irq_save/restore inside the console-spinning
functions and added a comment for v2.

John Ogness

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH next v1 2/3] printk: remove safe buffers
  2021-03-29 15:10         ` John Ogness
@ 2021-03-29 15:13           ` John Ogness
  0 siblings, 0 replies; 23+ messages in thread
From: John Ogness @ 2021-03-29 15:13 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, linux-kernel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Eric Biederman,
	Nicholas Piggin, Christophe Leroy, Alistair Popple,
	Jordan Niethe, Peter Zijlstra, Cédric Le Goater,
	Andrew Morton, Kees Cook, Yue Hu, Alexey Kardashevskiy,
	Rafael Aquini, Tiezhu Yang, Guilherme G. Piccoli,
	Paul E. McKenney, linuxppc-dev, kexec

On 2021-03-29, John Ogness <john.ogness@linutronix.de> wrote:
>> Will you call console write() callback with irq enabled from the
>> kthread?
>
> No. That defeats the fundamental purpose of this entire rework
> excercise. ;-)

Sorry, I misread your question. The answer is "yes". We want to avoid a
local_irq_save() when calling into console->write().

John Ogness

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2021-03-29 15:14 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-16 23:33 [PATCH next v1 0/3] printk: remove safe buffers John Ogness
2021-03-16 23:33 ` [PATCH next v1 1/3] printk: track/limit recursion John Ogness
2021-03-21  5:34   ` Sergey Senozhatsky
2021-03-22 10:53     ` John Ogness
2021-03-22 11:13       ` Sergey Senozhatsky
2021-03-22 15:07         ` Petr Mladek
2021-03-22 14:49   ` Petr Mladek
2021-03-23 21:32     ` John Ogness
2021-03-24  8:41       ` Petr Mladek
2021-03-16 23:33 ` [PATCH next v1 2/3] printk: remove safe buffers John Ogness
2021-03-21  5:26   ` Sergey Senozhatsky
2021-03-22 11:16     ` John Ogness
2021-03-22 18:02       ` Petr Mladek
2021-03-22 21:58         ` John Ogness
2021-03-23  9:46           ` Petr Mladek
2021-03-23 10:47   ` Petr Mladek
2021-03-26 11:12     ` John Ogness
2021-03-29 10:04       ` Petr Mladek
2021-03-29 15:10         ` John Ogness
2021-03-29 15:13           ` John Ogness
2021-03-16 23:33 ` [PATCH next v1 3/3] printk: convert @syslog_lock to spin_lock John Ogness
2021-03-23 12:01   ` Petr Mladek
2021-03-26 11:23     ` John Ogness

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).