[next,v4,1/2] lib/dump_stack: move cpu lock to printk.c

Message ID 20210617095051.4808-2-john.ogness@linutronix.de
State New, archived
Series: introduce printk cpu lock

Commit Message

John Ogness June 17, 2021, 9:50 a.m. UTC
dump_stack() implements its own cpu-reentrant spinning lock to
best-effort serialize stack traces in the printk log. However,
there are other functions (such as show_regs()) that can also
benefit from this serialization.

Move the cpu-reentrant spinning lock (cpu lock) into new helper
functions printk_cpu_lock_irqsave()/printk_cpu_unlock_irqrestore()
so that it is available for others as well. For !CONFIG_SMP the
cpu lock is a NOP.

Note that having multiple cpu locks in the system can easily
lead to deadlock. Code needing a cpu lock should use the
printk cpu lock, since the printk cpu lock could be acquired
from any code and any context.

Also note that it is not necessary for a cpu lock to disable
interrupts. However, in upcoming work this cpu lock will be used
for emergency tasks (for example, atomic consoles during kernel
crashes) and any interruptions while holding the cpu lock should
be avoided if possible.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 include/linux/printk.h | 41 +++++++++++++++++++++++++
 kernel/printk/printk.c | 69 ++++++++++++++++++++++++++++++++++++++++++
 lib/dump_stack.c       | 38 ++---------------------
 3 files changed, 112 insertions(+), 36 deletions(-)
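
For context, here is how a caller would use the new helpers to serialize
its own multi-line output, as the commit message suggests for functions
like show_regs(). A minimal sketch (illustrative only, not part of the
patch):

	unsigned long flags;

	/* Spins (with interrupts enabled) while another CPU owns the lock. */
	printk_cpu_lock_irqsave(flags);

	pr_info("first line that should stay together in the log\n");
	pr_info("second line that should stay together in the log\n");

	/* Release ownership and restore the saved interrupt state. */
	printk_cpu_unlock_irqrestore(flags);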

Comments

Steven Rostedt June 17, 2021, 1:32 p.m. UTC | #1
On Thu, 17 Jun 2021 11:56:50 +0206
John Ogness <john.ogness@linutronix.de> wrote:

> dump_stack() implements its own cpu-reentrant spinning lock to
> best-effort serialize stack traces in the printk log. However,
> there are other functions (such as show_regs()) that can also
> benefit from this serialization.
> 
> Move the cpu-reentrant spinning lock (cpu lock) into new helper
> functions printk_cpu_lock_irqsave()/printk_cpu_unlock_irqrestore()
> so that it is available for others as well. For !CONFIG_SMP the
> cpu lock is a NOP.
> 
> Note that having multiple cpu locks in the system can easily
> lead to deadlock. Code needing a cpu lock should use the
> printk cpu lock, since the printk cpu lock could be acquired
> from any code and any context.
> 
> Also note that it is not necessary for a cpu lock to disable
> interrupts. However, in upcoming work this cpu lock will be used
> for emergency tasks (for example, atomic consoles during kernel
> crashes) and any interruptions while holding the cpu lock should
> be avoided if possible.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>
> ---

Can we add this lock to early_printk() ?

This would make early_printk() so much more readable.

-- Steve

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 421c35571797..2b749c745c1f 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2259,6 +2259,7 @@ struct console *early_console;
 
 asmlinkage __visible void early_printk(const char *fmt, ...)
 {
+	unsigned long flags;
 	va_list ap;
 	char buf[512];
 	int n;
@@ -2270,7 +2271,9 @@ asmlinkage __visible void early_printk(const char *fmt, ...)
 	n = vscnprintf(buf, sizeof(buf), fmt, ap);
 	va_end(ap);
 
+	printk_cpu_lock_irqsave(flags);
 	early_console->write(early_console, buf, n);
+	printk_cpu_unlock_irqrestore(flags);
 }
 #endif
Petr Mladek June 18, 2021, 2:47 p.m. UTC | #2
On Thu 2021-06-17 09:32:43, Steven Rostedt wrote:
> On Thu, 17 Jun 2021 11:56:50 +0206
> John Ogness <john.ogness@linutronix.de> wrote:
> 
> > dump_stack() implements its own cpu-reentrant spinning lock to
> > best-effort serialize stack traces in the printk log. However,
> > there are other functions (such as show_regs()) that can also
> > benefit from this serialization.
> > 
> > Move the cpu-reentrant spinning lock (cpu lock) into new helper
> > functions printk_cpu_lock_irqsave()/printk_cpu_unlock_irqrestore()
> > so that it is available for others as well. For !CONFIG_SMP the
> > cpu lock is a NOP.
> > 
> > Note that having multiple cpu locks in the system can easily
> > lead to deadlock. Code needing a cpu lock should use the
> > printk cpu lock, since the printk cpu lock could be acquired
> > from any code and any context.
> > 
> > Also note that it is not necessary for a cpu lock to disable
> > interrupts. However, in upcoming work this cpu lock will be used
> > for emergency tasks (for example, atomic consoles during kernel
> > crashes) and any interruptions while holding the cpu lock should
> > be avoided if possible.
> > 
> > Signed-off-by: John Ogness <john.ogness@linutronix.de>
> > ---
> 
> Can we add this lock to early_printk() ?
> 
> This would make early_printk() so much more readable.

Good point! Just to be sure: do you see the messed-up output with a plain
kernel? Or do you need the extra patches (from Peter Zijlstra) that
redirect normal printk() to early_printk()?

My understanding is that early_printk() is used only for very early
boot messages in a plain kernel, and that there is not much concurrency
at that time.

That said, I always wanted to upstream Peter's patchset, but I never
found time to clean it up.

Best Regards,
Petr
John Ogness June 18, 2021, 2:55 p.m. UTC | #3
On 2021-06-17, Steven Rostedt <rostedt@goodmis.org> wrote:
> Can we add this lock to early_printk() ?
>
> This would make early_printk() so much more readable.
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 421c35571797..2b749c745c1f 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2259,6 +2259,7 @@ struct console *early_console;
>  
>  asmlinkage __visible void early_printk(const char *fmt, ...)
>  {
> +	unsigned long flags;
>  	va_list ap;
>  	char buf[512];
>  	int n;
> @@ -2270,7 +2271,9 @@ asmlinkage __visible void early_printk(const char *fmt, ...)
>  	n = vscnprintf(buf, sizeof(buf), fmt, ap);
>  	va_end(ap);
>  
> +	printk_cpu_lock_irqsave(flags);
>  	early_console->write(early_console, buf, n);
> +	printk_cpu_unlock_irqrestore(flags);
>  }
>  #endif

Since the cpu lock is also taken in NMI context (for example, via
nmi_cpu_backtrace()/dump_stack()), the main concerns are:

1. locks that are taken by a CPU that is holding the cpu lock

2. NMI contexts that take any type of lock

(Actually, #2 is just a special case of #1 where an NMI interrupted a
task that was holding the cpu lock.)

For early_printk(), the early USB devices look to be a problem:
early_xdbc_write() will take a spinlock. Assuming the early_console
was also registered as a normal console (via "keep"), we could end up
in the following deadlock between the normal console and
early_printk() writes:

    CPU0                          CPU1
    ----                          ----
    early_printk()                console->write()
      cpu_lock()                    spinlock()
      early_console->write()      *NMI*
        spinlock()                cpu_lock()

The upcoming atomic console work addresses this by implementing a new
write_atomic() callback that is lockless (and SMP-safe) or aware of the
cpu lock, to avoid deadlocks such as the one above.

AFAICT, the USB devices (CONFIG_EARLY_PRINTK_USB) are the only
early_printk() candidates that use locking. So for all other
early_printk() implementations I think your suggestion would work fine.

Although, in general, early_printk() is not SMP-safe. So I'm not sure
how much safety we need to include at this point.

John Ogness
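To make the interleaving in John's scenario concrete, here is a sketch
of the two code paths involved (xdbc_lock stands in for whatever lock
early_xdbc_write() takes internally; the names are illustrative, not
from the patch):

	/* CPU0: early_printk() while holding the printk cpu lock. */
	printk_cpu_lock_irqsave(flags);
	early_console->write(early_console, buf, n);
		/* -> early_xdbc_write() */
		spin_lock(&xdbc_lock);	/* spins: CPU1 holds xdbc_lock */

	/* CPU1: normal console write path, interrupted by an NMI. */
	spin_lock(&xdbc_lock);
		/* NMI arrives while the spinlock is held:     */
		/* nmi_cpu_backtrace() -> dump_stack()         */
		printk_cpu_lock_irqsave(flags);	/* spins: CPU0 owns it */

Neither CPU can make progress, which is the deadlock diagrammed above.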
Steven Rostedt June 18, 2021, 4:25 p.m. UTC | #4
On Fri, 18 Jun 2021 16:47:39 +0200
Petr Mladek <pmladek@suse.com> wrote:

> Good point! Just to be sure: do you see the messed-up output with a plain
> kernel? Or do you need the extra patches (from Peter Zijlstra) that
> redirect normal printk() to early_printk()?

I sometimes use this with Peter's patches, which also do basically the
same thing.

> 
> My understanding is that early_printk() is used only for very early
> boot messages in a plain kernel, and that there is not much concurrency
> at that time.

It will continue if you use the ",keep" option. And that is something I
have done without Peter's patches, but then the messages become
illegible when a bug triggers on more than one CPU.

> 
> That said, I always wanted to upstream Peter's patchset, but I never
> found time to clean it up.

That would be great too!

-- Steve
Steven Rostedt June 18, 2021, 4:31 p.m. UTC | #5
On Fri, 18 Jun 2021 17:01:37 +0206
John Ogness <john.ogness@linutronix.de> wrote:

> Since the cpu lock is also taken in NMI context (for example, via
> nmi_cpu_backtrace()/dump_stack()), the main concerns are:
> 
> 1. locks that are taken by a CPU that is holding the cpu lock
> 
> 2. NMI contexts that take any type of lock
> 
> (Actually, #2 is just a special case of #1 where an NMI interrupted a
> task that was holding the cpu lock.)
> 
> For early_printk(), the early USB devices look to be a problem:
> early_xdbc_write() will take a spinlock. Assuming the early_console
> was also registered as a normal console (via "keep"), we could end up
> in the following deadlock between the normal console and
> early_printk() writes:

My use case of earlyprintk is to avoid all the crud in printk, and
just have something to print to the serial console. USB early printk is
not my use case, as that adds even more crud.

Thus, I don't care about this being an issue with USB early printk ;-)


> 
>     CPU0                          CPU1
>     ----                          ----
>     early_printk()                console->write()
>       cpu_lock()                    spinlock()
>       early_console->write()      *NMI*
>         spinlock()                cpu_lock()
> 
> The upcoming atomic console work addresses this by implementing a new
> write_atomic() callback that is lockless (and SMP-safe) or aware of the
> cpu lock, to avoid deadlocks such as the one above.
> 
> AFAICT, the USB devices (CONFIG_EARLY_PRINTK_USB) are the only
> early_printk() candidates that use locking. So for all other
> early_printk() implementations I think your suggestion would work fine.
> 
> Although, in general, early_printk() is not SMP-safe. So I'm not sure
> how much safety we need to include at this point.
> 

Right. I say we ignore the issue with USB earlyprintk. The only time I
can see it even being useful is for issues that happen before printk is
initialized, when all you have for a console is USB early printk, and in
that case the above scenario shouldn't be an issue. But for setups that
use early printk later in the boot process (and during normal activity),
any console that takes locks would be useless.

-- Steve
John Ogness June 19, 2021, 12:22 a.m. UTC | #6
On 2021-06-18, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Fri, 18 Jun 2021 16:47:39 +0200 Petr Mladek <pmladek@suse.com> wrote:
>> My understanding is that early_printk() is used only for very early
>> boot messages in a plain kernel, and that there is not much concurrency
>> at that time.
>
> It will continue if you use the ",keep" option. And that is something I
> have done without Peter's patches, but then the messages become
> illegible when a bug triggers on more than one CPU.

Note that using "keep" so that boot consoles stay registered as regular
consoles does not mean that early_printk() is used. Your suggestion of
adding the cpu lock to early_printk() will not help in the "keep"
scenario.

>> That said, I always wanted to upstream Peter's patchset, but I never
>> found time to clean it up.

The main problem with Peter's patchset (aside from using unsafe
early_printk code in a preemptive environment) is that it does not
handle CONT messages. While I'm sure this is fine for Peter, it is quite
horrible for things like lockdep output.

It would not be too much work to implement an early_printk printing loop
as a reader of the ringbuffer. But that is basically what the upcoming
atomic console and sync mode will be doing. Once that code is available,
it would be trivial to allow early_printk usage if an atomic console is
not available.

My recommendation right now is to wait a bit longer on the early_printk
hack. We are not far from the atomic console and sync mode series (which
comes immediately after the safe buffer removal series that we are
currently pushing).

John Ogness
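
For reference on the CONT limitation mentioned above: continuation
messages are the pieces emitted via pr_cont(), which printk is expected
to join into a single record. A small illustration (hypothetical names)
of the kind of output that degrades when continuations are not handled:

	/* lockdep-style output assembled from continuation pieces: */
	printk(KERN_INFO "lock held: ");
	pr_cont("%s", lock_name);
	pr_cont(" at depth %d\n", depth);

With CONT handling this forms one line in the log; without it, each
piece can end up as a separate, interleaved line.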

Patch

diff --git a/include/linux/printk.h b/include/linux/printk.h
index f589b8b60806..d796183f26c9 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -287,6 +287,47 @@ static inline void printk_safe_flush_on_panic(void)
 }
 #endif
 
+#ifdef CONFIG_SMP
+extern int __printk_cpu_trylock(void);
+extern void __printk_wait_on_cpu_lock(void);
+extern void __printk_cpu_unlock(void);
+
+/**
+ * printk_cpu_lock_irqsave() - Acquire the printk cpu-reentrant spinning
+ *                             lock and disable interrupts.
+ * @flags: Stack-allocated storage for saving local interrupt state,
+ *         to be passed to printk_cpu_unlock_irqrestore().
+ *
+ * If the lock is owned by another CPU, spin until it becomes available.
+ * Interrupts are restored while spinning.
+ */
+#define printk_cpu_lock_irqsave(flags)		\
+	for (;;) {				\
+		local_irq_save(flags);		\
+		if (__printk_cpu_trylock())	\
+			break;			\
+		local_irq_restore(flags);	\
+		__printk_wait_on_cpu_lock();	\
+	}
+
+/**
+ * printk_cpu_unlock_irqrestore() - Release the printk cpu-reentrant spinning
+ *                                  lock and restore interrupts.
+ * @flags: Caller's saved interrupt state, from printk_cpu_lock_irqsave().
+ */
+#define printk_cpu_unlock_irqrestore(flags)	\
+	do {					\
+		__printk_cpu_unlock();		\
+		local_irq_restore(flags);	\
+	} while (0)				\
+
+#else
+
+#define printk_cpu_lock_irqsave(flags) ((void)flags)
+#define printk_cpu_unlock_irqrestore(flags) ((void)flags)
+
+#endif /* CONFIG_SMP */
+
 extern int kptr_restrict;
 
 /**
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 114e9963f903..08e14a67c44e 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3532,3 +3532,72 @@ void kmsg_dump_rewind(struct kmsg_dump_iter *iter)
 EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
 
 #endif
+
+#ifdef CONFIG_SMP
+static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1);
+static atomic_t printk_cpulock_nested = ATOMIC_INIT(0);
+
+/**
+ * __printk_wait_on_cpu_lock() - Busy wait until the printk cpu-reentrant
+ *                               spinning lock is not owned by any CPU.
+ *
+ * Context: Any context.
+ */
+void __printk_wait_on_cpu_lock(void)
+{
+	do {
+		cpu_relax();
+	} while (atomic_read(&printk_cpulock_owner) != -1);
+}
+EXPORT_SYMBOL(__printk_wait_on_cpu_lock);
+
+/**
+ * __printk_cpu_trylock() - Try to acquire the printk cpu-reentrant
+ *                          spinning lock.
+ *
+ * If no processor has the lock, the calling processor takes the lock and
+ * becomes the owner. If the calling processor is already the owner of the
+ * lock, this function succeeds immediately.
+ *
+ * Context: Any context. Expects interrupts to be disabled.
+ * Return: 1 on success, otherwise 0.
+ */
+int __printk_cpu_trylock(void)
+{
+	int cpu;
+	int old;
+
+	cpu = smp_processor_id();
+
+	old = atomic_cmpxchg(&printk_cpulock_owner, -1, cpu);
+	if (old == -1) {
+		/* This CPU is now the owner. */
+		return 1;
+	} else if (old == cpu) {
+		/* This CPU is already the owner. */
+		atomic_inc(&printk_cpulock_nested);
+		return 1;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(__printk_cpu_trylock);
+
+/**
+ * __printk_cpu_unlock() - Release the printk cpu-reentrant spinning lock.
+ *
+ * The calling processor must be the owner of the lock.
+ *
+ * Context: Any context. Expects interrupts to be disabled.
+ */
+void __printk_cpu_unlock(void)
+{
+	if (atomic_read(&printk_cpulock_nested)) {
+		atomic_dec(&printk_cpulock_nested);
+		return;
+	}
+
+	atomic_set(&printk_cpulock_owner, -1);
+}
+EXPORT_SYMBOL(__printk_cpu_unlock);
+#endif /* CONFIG_SMP */
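
The printk_cpulock_nested counter above is what makes the lock
cpu-reentrant: a second acquisition on the owning CPU succeeds
immediately and only bumps the nesting depth. An illustrative sequence
on one CPU (not part of the patch; a nested acquisition can happen via
an NMI, since interrupts are disabled while the lock is held):

	unsigned long flags, flags2;

	printk_cpu_lock_irqsave(flags);		/* owner = this CPU, nested = 0 */
	/* ... NMI -> nmi_cpu_backtrace() -> dump_stack() ... */
	printk_cpu_lock_irqsave(flags2);	/* old == cpu: nested -> 1 */
	printk_cpu_unlock_irqrestore(flags2);	/* nested -> 0, still owner */
	printk_cpu_unlock_irqrestore(flags);	/* owner = -1, lock released */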
diff --git a/lib/dump_stack.c b/lib/dump_stack.c
index 6e7ca3d67710..cd3387bb34e5 100644
--- a/lib/dump_stack.c
+++ b/lib/dump_stack.c
@@ -93,52 +93,18 @@ static void __dump_stack(const char *log_lvl)
  *
  * Architectures can override this implementation by implementing its own.
  */
-#ifdef CONFIG_SMP
-static atomic_t dump_lock = ATOMIC_INIT(-1);
-
 asmlinkage __visible void dump_stack_lvl(const char *log_lvl)
 {
 	unsigned long flags;
-	int was_locked;
-	int old;
-	int cpu;
 
 	/*
 	 * Permit this cpu to perform nested stack dumps while serialising
 	 * against other CPUs
 	 */
-retry:
-	local_irq_save(flags);
-	cpu = smp_processor_id();
-	old = atomic_cmpxchg(&dump_lock, -1, cpu);
-	if (old == -1) {
-		was_locked = 0;
-	} else if (old == cpu) {
-		was_locked = 1;
-	} else {
-		local_irq_restore(flags);
-		/*
-		 * Wait for the lock to release before jumping to
-		 * atomic_cmpxchg() in order to mitigate the thundering herd
-		 * problem.
-		 */
-		do { cpu_relax(); } while (atomic_read(&dump_lock) != -1);
-		goto retry;
-	}
-
-	__dump_stack(log_lvl);
-
-	if (!was_locked)
-		atomic_set(&dump_lock, -1);
-
-	local_irq_restore(flags);
-}
-#else
-asmlinkage __visible void dump_stack_lvl(const char *log_lvl)
-{
+	printk_cpu_lock_irqsave(flags);
 	__dump_stack(log_lvl);
+	printk_cpu_unlock_irqrestore(flags);
 }
-#endif
 EXPORT_SYMBOL(dump_stack_lvl);
 
 asmlinkage __visible void dump_stack(void)