All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH next v3 0/2] introduce printk cpu lock
@ 2021-06-15 17:49 John Ogness
  2021-06-15 17:49 ` [PATCH next v3 1/2] dump_stack: move cpu lock to printk.c John Ogness
  2021-06-15 17:49 ` [PATCH next v3 2/2] printk: fix cpu lock ordering John Ogness
  0 siblings, 2 replies; 10+ messages in thread
From: John Ogness @ 2021-06-15 17:49 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Stephen Rothwell, Andrew Morton,
	Daniel Bristot de Oliveira, Stephen Boyd, Alexander Potapenko,
	Peter Zijlstra, Paul E. McKenney

Hello,

While working on removing the safe buffers for printk [0] we
stumbled on a cpu-reentrant spinning lock used by dump_stack(). This
type of lock (dubbed a cpu lock) will cause deadlock risks once we
introduce atomic consoles because atomic consoles also need such a
lock.

Although we are not yet ready to introduce atomic consoles, this is
an appropriate time to provide an official cpu lock to be used for
all things relating to printk (including the atomic consoles, once
they are introduced).

This series is against next-20210615.

v2 can be found here [1].

Changes since v2:

- Remove @lock_flag argument from lock/unlock functions. Since there
  is only 1 global printk cpu lock, the @lock_flag can be a static
  global variable inside printk.c. Also, that flag is renamed to
  @printk_cpulock_nested.

- Use typical irqsave/irqrestore calling conventions. This means
  changing the lock/unlock functions to macros and set the irq flags
  within those macros, rather than passing the irq flags as a
  function argument.

- Split the lock function into 2 functions: trylock and wait. This
  simplifies the macro implementation. Later, for atomic consoles,
  this will be useful because atomic consoles will implement the
  waiting component differently.

- Fix a "used uninitialized" warning for !CONFIG_SMP, reported by
  the kbuild robot.

- Extend memory barrier comments to cover both "previous to current"
  and "current to next" guarantees.

John Ogness

[0] https://lore.kernel.org/lkml/YGW63%2FelFr%2FgYW1u@alley
[1] https://lore.kernel.org/lkml/20210607200232.22211-1-john.ogness@linutronix.de

John Ogness (2):
  dump_stack: move cpu lock to printk.c
  printk: fix cpu lock ordering

 include/linux/printk.h |  41 +++++++++++++++
 kernel/printk/printk.c | 114 +++++++++++++++++++++++++++++++++++++++++
 lib/dump_stack.c       |  38 +-------------
 3 files changed, 157 insertions(+), 36 deletions(-)

--
2.20.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH next v3 1/2] dump_stack: move cpu lock to printk.c
  2021-06-15 17:49 [PATCH next v3 0/2] introduce printk cpu lock John Ogness
@ 2021-06-15 17:49 ` John Ogness
  2021-06-15 21:33   ` John Ogness
  2021-06-15 17:49 ` [PATCH next v3 2/2] printk: fix cpu lock ordering John Ogness
  1 sibling, 1 reply; 10+ messages in thread
From: John Ogness @ 2021-06-15 17:49 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Stephen Rothwell, Andrew Morton,
	Daniel Bristot de Oliveira, Stephen Boyd, Alexander Potapenko,
	Peter Zijlstra

dump_stack() implements its own cpu-reentrant spinning lock to
best-effort serialize stack traces in the printk log. However,
there are other functions (such as show_regs()) that can also
benefit from this serialization.

Move the cpu-reentrant spinning lock (cpu lock) into new helper
functions printk_cpu_lock_irqsave()/printk_cpu_unlock_irqrestore()
so that it is available for others as well. For !CONFIG_SMP the
cpu lock is a NOP.

Note that having multiple cpu locks in the system can easily
lead to deadlock. Code needing a cpu lock should use the
printk cpu lock, since the printk cpu lock could be acquired
from any code and any context.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 include/linux/printk.h | 41 ++++++++++++++++++++++++++
 kernel/printk/printk.c | 67 ++++++++++++++++++++++++++++++++++++++++++
 lib/dump_stack.c       | 38 ++----------------------
 3 files changed, 110 insertions(+), 36 deletions(-)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index f589b8b60806..d796183f26c9 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -287,6 +287,47 @@ static inline void printk_safe_flush_on_panic(void)
 }
 #endif
 
+#ifdef CONFIG_SMP
+extern int __printk_cpu_trylock(void);
+extern void __printk_wait_on_cpu_lock(void);
+extern void __printk_cpu_unlock(void);
+
+/**
+ * printk_cpu_lock_irqsave() - Acquire the printk cpu-reentrant spinning
+ *                             lock and disable interrupts.
+ * @flags: Stack-allocated storage for saving local interrupt state,
+ *         to be passed to printk_cpu_unlock_irqrestore().
+ *
+ * If the lock is owned by another CPU, spin until it becomes available.
+ * Interrupts are restored while spinning.
+ */
+#define printk_cpu_lock_irqsave(flags)		\
+	for (;;) {				\
+		local_irq_save(flags);		\
+		if (__printk_cpu_trylock())	\
+			break;			\
+		local_irq_restore(flags);	\
+		__printk_wait_on_cpu_lock();	\
+	}
+
+/**
+ * printk_cpu_unlock_irqrestore() - Release the printk cpu-reentrant spinning
+ *                                  lock and restore interrupts.
+ * @flags: Caller's saved interrupt state, from printk_cpu_lock_irqsave().
+ */
+#define printk_cpu_unlock_irqrestore(flags)	\
+	do {					\
+		__printk_cpu_unlock();		\
+		local_irq_restore(flags);	\
+	} while (0)				\
+
+#else
+
+#define printk_cpu_lock_irqsave(flags) ((void)flags)
+#define printk_cpu_unlock_irqrestore(flags) ((void)flags)
+
+#endif /* CONFIG_SMP */
+
 extern int kptr_restrict;
 
 /**
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 114e9963f903..5369d8f33299 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3532,3 +3532,70 @@ void kmsg_dump_rewind(struct kmsg_dump_iter *iter)
 EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
 
 #endif
+
+#ifdef CONFIG_SMP
+static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1);
+static bool printk_cpulock_nested;
+
+/**
+ * __printk_wait_on_cpu_lock() - Busy wait until the printk cpu-reentrant
+ *                               spinning lock is not owned by any CPU.
+ */
+void __printk_wait_on_cpu_lock(void)
+{
+	do {
+		cpu_relax();
+	} while (atomic_read(&printk_cpulock_owner) != -1);
+}
+EXPORT_SYMBOL(__printk_wait_on_cpu_lock);
+
+/**
+ * __printk_cpu_trylock() - Try to acquire the printk cpu-reentrant
+ *                          spinning lock.
+ *
+ * If no processor has the lock, the calling processor takes the lock and
+ * becomes the owner. If the calling processor is already the owner of the
+ * lock, this function succeeds immediately.
+ *
+ * Context: Any context. Expects interrupts to be disabled.
+ * Return: 1 on success, otherwise 0.
+ */
+int __printk_cpu_trylock(void)
+{
+	int cpu;
+	int old;
+
+	cpu = smp_processor_id();
+
+	old = atomic_cmpxchg(&printk_cpulock_owner, -1, cpu);
+	if (old == -1) {
+		/* This CPU is now the owner. */
+		return 1;
+	} else if (old == cpu) {
+		/* This CPU is already the owner. */
+		printk_cpulock_nested = true;
+		return 1;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(__printk_cpu_trylock);
+
+/**
+ * __printk_cpu_unlock() - Release the printk cpu-reentrant spinning lock.
+ *
+ * The calling processor must be the owner of the lock.
+ *
+ * Context: Any context. Expects interrupts to be disabled.
+ */
+void __printk_cpu_unlock(void)
+{
+	if (printk_cpulock_nested) {
+		printk_cpulock_nested = false;
+		return;
+	}
+
+	atomic_set_release(&printk_cpulock_owner, -1);
+}
+EXPORT_SYMBOL(__printk_cpu_unlock);
+#endif /* CONFIG_SMP */
diff --git a/lib/dump_stack.c b/lib/dump_stack.c
index 6e7ca3d67710..cd3387bb34e5 100644
--- a/lib/dump_stack.c
+++ b/lib/dump_stack.c
@@ -93,52 +93,18 @@ static void __dump_stack(const char *log_lvl)
  *
  * Architectures can override this implementation by implementing its own.
  */
-#ifdef CONFIG_SMP
-static atomic_t dump_lock = ATOMIC_INIT(-1);
-
 asmlinkage __visible void dump_stack_lvl(const char *log_lvl)
 {
 	unsigned long flags;
-	int was_locked;
-	int old;
-	int cpu;
 
 	/*
 	 * Permit this cpu to perform nested stack dumps while serialising
 	 * against other CPUs
 	 */
-retry:
-	local_irq_save(flags);
-	cpu = smp_processor_id();
-	old = atomic_cmpxchg(&dump_lock, -1, cpu);
-	if (old == -1) {
-		was_locked = 0;
-	} else if (old == cpu) {
-		was_locked = 1;
-	} else {
-		local_irq_restore(flags);
-		/*
-		 * Wait for the lock to release before jumping to
-		 * atomic_cmpxchg() in order to mitigate the thundering herd
-		 * problem.
-		 */
-		do { cpu_relax(); } while (atomic_read(&dump_lock) != -1);
-		goto retry;
-	}
-
-	__dump_stack(log_lvl);
-
-	if (!was_locked)
-		atomic_set(&dump_lock, -1);
-
-	local_irq_restore(flags);
-}
-#else
-asmlinkage __visible void dump_stack_lvl(const char *log_lvl)
-{
+	printk_cpu_lock_irqsave(flags);
 	__dump_stack(log_lvl);
+	printk_cpu_unlock_irqrestore(flags);
 }
-#endif
 EXPORT_SYMBOL(dump_stack_lvl);
 
 asmlinkage __visible void dump_stack(void)
-- 
2.20.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH next v3 2/2] printk: fix cpu lock ordering
  2021-06-15 17:49 [PATCH next v3 0/2] introduce printk cpu lock John Ogness
  2021-06-15 17:49 ` [PATCH next v3 1/2] dump_stack: move cpu lock to printk.c John Ogness
@ 2021-06-15 17:49 ` John Ogness
  2021-06-16 11:30   ` Petr Mladek
  1 sibling, 1 reply; 10+ messages in thread
From: John Ogness @ 2021-06-15 17:49 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Paul E. McKenney

The cpu lock implementation uses a full memory barrier to take
the lock, but no memory barriers when releasing the lock. This
means that changes performed by a lock owner may not be seen by
the next lock owner. This may have been "good enough" for use
by dump_stack() as a serialization mechanism, but it is not
enough to provide proper protection for a critical section.

Correct this problem by using acquire/release memory barriers
for lock/unlock, respectively.

Note that it is not necessary for a cpu lock to disable
interrupts. However, in upcoming work this cpu lock will be used
for emergency tasks (for example, atomic consoles during kernel
crashes) and any interruptions while holding the cpu lock should
be avoided if possible.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 kernel/printk/printk.c | 53 +++++++++++++++++++++++++++++++++++++++---
 1 file changed, 50 insertions(+), 3 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 5369d8f33299..e67dc510fa1b 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3567,10 +3567,33 @@ int __printk_cpu_trylock(void)
 
 	cpu = smp_processor_id();
 
-	old = atomic_cmpxchg(&printk_cpulock_owner, -1, cpu);
+	/*
+	 * Guarantee loads and stores from this CPU when it is the lock owner
+	 * are _not_ visible to the previous lock owner. This pairs with
+	 * __printk_cpu_unlock:B.
+	 *
+	 * Memory barrier involvement:
+	 *
+	 * If __printk_cpu_trylock:A reads from __printk_cpu_unlock:B, then
+	 * __printk_cpu_unlock:A can never read from __printk_cpu_trylock:B.
+	 *
+	 * Relies on:
+	 *
+	 * RELEASE from __printk_cpu_unlock:A to __printk_cpu_unlock:B
+	 * of the previous CPU
+	 *    matching
+	 * ACQUIRE from __printk_cpu_trylock:A to __printk_cpu_trylock:B
+	 * of this CPU
+	 */
+	old = atomic_cmpxchg_acquire(&printk_cpulock_owner, -1,
+				     cpu); /* LMM(__printk_cpu_trylock:A) */
 	if (old == -1) {
-		/* This CPU is now the owner. */
+		/*
+		 * This CPU is now the owner and begins loading/storing
+		 * data: LMM(__printk_cpu_trylock:B)
+		 */
 		return 1;
+
 	} else if (old == cpu) {
 		/* This CPU is already the owner. */
 		printk_cpulock_nested = true;
@@ -3595,7 +3618,31 @@ void __printk_cpu_unlock(void)
 		return;
 	}
 
-	atomic_set_release(&printk_cpulock_owner, -1);
+	/*
+	 * This CPU is finished loading/storing data:
+	 * LMM(__printk_cpu_unlock:A)
+	 */
+
+	/*
+	 * Guarantee loads and stores from this CPU when it was the
+	 * lock owner are visible to the next lock owner. This pairs
+	 * with __printk_cpu_trylock:A.
+	 *
+	 * Memory barrier involvement:
+	 *
+	 * If __printk_cpu_trylock:A reads from __printk_cpu_unlock:B,
+	 * then __printk_cpu_trylock:B reads from __printk_cpu_unlock:A.
+	 *
+	 * Relies on:
+	 *
+	 * RELEASE from __printk_cpu_unlock:A to __printk_cpu_unlock:B
+	 * of this CPU
+	 *    matching
+	 * ACQUIRE from __printk_cpu_trylock:A to __printk_cpu_trylock:B
+	 * of the next CPU
+	 */
+	atomic_set_release(&printk_cpulock_owner,
+			   -1); /* LMM(__printk_cpu_unlock:B) */
 }
 EXPORT_SYMBOL(__printk_cpu_unlock);
 #endif /* CONFIG_SMP */
-- 
2.20.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH next v3 1/2] dump_stack: move cpu lock to printk.c
  2021-06-15 17:49 ` [PATCH next v3 1/2] dump_stack: move cpu lock to printk.c John Ogness
@ 2021-06-15 21:33   ` John Ogness
  2021-06-16  7:06     ` Sergey Senozhatsky
  0 siblings, 1 reply; 10+ messages in thread
From: John Ogness @ 2021-06-15 21:33 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Stephen Rothwell, Andrew Morton,
	Daniel Bristot de Oliveira, Stephen Boyd, Alexander Potapenko,
	Peter Zijlstra

On 2021-06-15, John Ogness <john.ogness@linutronix.de> wrote:
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 114e9963f903..5369d8f33299 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3532,3 +3532,70 @@ void kmsg_dump_rewind(struct kmsg_dump_iter *iter)
>  EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
>  
>  #endif
> +
> +#ifdef CONFIG_SMP
> +static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1);
> +static bool printk_cpulock_nested;

I just realized that @printk_cpulock_nested will need to be an atomic_t
counter to allow multiple nested levels since nesting can also occur
because of recursion and not only because of an interrupting NMI
context. So a v4 will be needed for that simple change. But please still
comment on the rest.

Thanks.

John Ogness

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH next v3 1/2] dump_stack: move cpu lock to printk.c
  2021-06-15 21:33   ` John Ogness
@ 2021-06-16  7:06     ` Sergey Senozhatsky
  2021-06-16  7:29       ` John Ogness
  0 siblings, 1 reply; 10+ messages in thread
From: Sergey Senozhatsky @ 2021-06-16  7:06 UTC (permalink / raw)
  To: John Ogness
  Cc: Petr Mladek, Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Stephen Rothwell, Andrew Morton,
	Daniel Bristot de Oliveira, Stephen Boyd, Alexander Potapenko,
	Peter Zijlstra

On (21/06/15 23:39), John Ogness wrote:
> On 2021-06-15, John Ogness <john.ogness@linutronix.de> wrote:
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index 114e9963f903..5369d8f33299 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -3532,3 +3532,70 @@ void kmsg_dump_rewind(struct kmsg_dump_iter *iter)
> >  EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
> >  
> >  #endif
> > +
> > +#ifdef CONFIG_SMP
> > +static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1);
> > +static bool printk_cpulock_nested;
> 
> I just realized that @printk_cpulock_nested will need to be an atomic_t
> counter to allow multiple nested levels since nesting can also occur

Strictly speaking, this is not nested printk, right? printk recursion is
handled in printk separately. This one is more like "nested dump_stack()-s",
or nested "error reporinting".

Shall this be a separate patch? Because the original code has never
limited nested error reporting contexts.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH next v3 1/2] dump_stack: move cpu lock to printk.c
  2021-06-16  7:06     ` Sergey Senozhatsky
@ 2021-06-16  7:29       ` John Ogness
  2021-06-16 11:21         ` Petr Mladek
  2021-06-16 11:55         ` Sergey Senozhatsky
  0 siblings, 2 replies; 10+ messages in thread
From: John Ogness @ 2021-06-16  7:29 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Stephen Rothwell, Andrew Morton,
	Daniel Bristot de Oliveira, Stephen Boyd, Alexander Potapenko,
	Peter Zijlstra

On 2021-06-16, Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
> On (21/06/15 23:39), John Ogness wrote:
>> On 2021-06-15, John Ogness <john.ogness@linutronix.de> wrote:
>> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> > index 114e9963f903..5369d8f33299 100644
>> > --- a/kernel/printk/printk.c
>> > +++ b/kernel/printk/printk.c
>> > @@ -3532,3 +3532,70 @@ void kmsg_dump_rewind(struct kmsg_dump_iter *iter)
>> >  EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
>> >  
>> >  #endif
>> > +
>> > +#ifdef CONFIG_SMP
>> > +static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1);
>> > +static bool printk_cpulock_nested;
>> 
>> I just realized that @printk_cpulock_nested will need to be an atomic_t
>> counter to allow multiple nested levels since nesting can also occur
>
> Strictly speaking, this is not nested printk, right? printk recursion is
> handled in printk separately. This one is more like "nested dump_stack()-s",
> or nested "error reporinting".
>
> Because the original code has never limited nested error reporting
> contexts.

It isn't about limiting. It is about tracking. The current dump_stack()
handles it correctly because the tracking is done in the stack frame of
the caller (in @was_locked of dump_stack_lvl()). My previous versions
also handled it correctly by using the same technique.

With this series version I moved the tracking into a global variable
@printk_cpulock_nested, which is fine, except that a boolean is not
capable of tracking more than 1 nesting. Which means that
__printk_cpu_unlock() would release cpu lock ownership too soon.

Doing this correctly is a simple change:

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index e67dc510fa1b..5376216e4f3d 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3535,7 +3535,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
 
 #ifdef CONFIG_SMP
 static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1);
-static bool printk_cpulock_nested;
+static atomic_t printk_cpulock_nested = ATOMIC_INIT(0);
 
 /**
  * __printk_wait_on_cpu_lock() - Busy wait until the printk cpu-reentrant
@@ -3596,7 +3598,7 @@ int __printk_cpu_trylock(void)
 
 	} else if (old == cpu) {
 		/* This CPU is already the owner. */
-		printk_cpulock_nested = true;
+		atomic_inc(&printk_cpulock_nested);
 		return 1;
 	}
 
@@ -3613,8 +3615,8 @@ EXPORT_SYMBOL(__printk_cpu_trylock);
  */
 void __printk_cpu_unlock(void)
 {
-	if (printk_cpulock_nested) {
-		printk_cpulock_nested = false;
+	if (atomic_read(&printk_cpulock_nested)) {
+		atomic_dec(&printk_cpulock_nested);
 		return;
 	}

> Shall this be a separate patch?

I would prefer a v4 because I also noticed that this patch accidentally
implements atomic_set_release() instead of moving over the atomit_set()
from dump_stack(). That also needs to be corrected, otherwise the next
patch in the series makes no sense.

John Ogness

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH next v3 1/2] dump_stack: move cpu lock to printk.c
  2021-06-16  7:29       ` John Ogness
@ 2021-06-16 11:21         ` Petr Mladek
  2021-06-16 13:40           ` John Ogness
  2021-06-16 11:55         ` Sergey Senozhatsky
  1 sibling, 1 reply; 10+ messages in thread
From: Petr Mladek @ 2021-06-16 11:21 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Stephen Rothwell, Andrew Morton,
	Daniel Bristot de Oliveira, Stephen Boyd, Alexander Potapenko,
	Peter Zijlstra

On Wed 2021-06-16 09:35:35, John Ogness wrote:
> On 2021-06-16, Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
> It isn't about limiting. It is about tracking. The current dump_stack()
> handles it correctly because the tracking is done in the stack frame of
> the caller (in @was_locked of dump_stack_lvl()). My previous versions
> also handled it correctly by using the same technique.
> 
> With this series version I moved the tracking into a global variable
> @printk_cpulock_nested, which is fine, except that a boolean is not
> capable of tracking more than 1 nesting. Which means that
> __printk_cpu_unlock() would release cpu lock ownership too soon.
> 
> Doing this correctly is a simple change:
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index e67dc510fa1b..5376216e4f3d 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3535,7 +3535,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
>  
>  #ifdef CONFIG_SMP
>  static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1);
> -static bool printk_cpulock_nested;
> +static atomic_t printk_cpulock_nested = ATOMIC_INIT(0);
>  
>  /**
>   * __printk_wait_on_cpu_lock() - Busy wait until the printk cpu-reentrant
> @@ -3596,7 +3598,7 @@ int __printk_cpu_trylock(void)
>  
>  	} else if (old == cpu) {
>  		/* This CPU is already the owner. */
> -		printk_cpulock_nested = true;
> +		atomic_inc(&printk_cpulock_nested);
>  		return 1;
>  	}
>  
> @@ -3613,8 +3615,8 @@ EXPORT_SYMBOL(__printk_cpu_trylock);
>   */
>  void __printk_cpu_unlock(void)
>  {
> -	if (printk_cpulock_nested) {
> -		printk_cpulock_nested = false;
> +	if (atomic_read(&printk_cpulock_nested)) {
> +		atomic_dec(&printk_cpulock_nested);

I think about handling printk_cpulock_nested with only one
atomic operation. Something like:

	if (atomic_dec_return(&printk_cpulock_level) == 0)
		atomic_set_release(&printk_cpulock_owner, -1);

It would require always incremanting the number in lock, e.g.

	old = atomic_cmpxchg(&printk_cpulock_owner, -1, cpu);
	if (old == -1 || old == cpu) {
		atomic_inc(&printk_cpulock_level);
		return 1;
	}

But I am not sure if it is really better. Feel free to keep
your variant.

>  		return;
>  	}
> 
> > Shall this be a separate patch?
> 
> I would prefer a v4 because I also noticed that this patch accidentally
> implements atomic_set_release() instead of moving over the atomit_set()
> from dump_stack(). That also needs to be corrected, otherwise the next
> patch in the series makes no sense.

Yes, this needs to get fixed as well.

Otherwise, the patch looks good to me. I haven't found any other
problems, except for the two already mentioned (count nested levels,
introduce atomic_set_release() in 2nd patch).

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH next v3 2/2] printk: fix cpu lock ordering
  2021-06-15 17:49 ` [PATCH next v3 2/2] printk: fix cpu lock ordering John Ogness
@ 2021-06-16 11:30   ` Petr Mladek
  0 siblings, 0 replies; 10+ messages in thread
From: Petr Mladek @ 2021-06-16 11:30 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Paul E. McKenney

On Tue 2021-06-15 19:55:47, John Ogness wrote:
> The cpu lock implementation uses a full memory barrier to take
> the lock, but no memory barriers when releasing the lock. This
> means that changes performed by a lock owner may not be seen by
> the next lock owner. This may have been "good enough" for use
> by dump_stack() as a serialization mechanism, but it is not
> enough to provide proper protection for a critical section.
> 
> Correct this problem by using acquire/release memory barriers
> for lock/unlock, respectively.
> 
> Note that it is not necessary for a cpu lock to disable
> interrupts. However, in upcoming work this cpu lock will be used
> for emergency tasks (for example, atomic consoles during kernel
> crashes) and any interruptions while holding the cpu lock should
> be avoided if possible.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>

The patch looks good to me, except for the problems already
reported for the 1st patch.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH next v3 1/2] dump_stack: move cpu lock to printk.c
  2021-06-16  7:29       ` John Ogness
  2021-06-16 11:21         ` Petr Mladek
@ 2021-06-16 11:55         ` Sergey Senozhatsky
  1 sibling, 0 replies; 10+ messages in thread
From: Sergey Senozhatsky @ 2021-06-16 11:55 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Stephen Rothwell, Andrew Morton,
	Daniel Bristot de Oliveira, Stephen Boyd, Alexander Potapenko,
	Peter Zijlstra

On (21/06/16 09:35), John Ogness wrote:
> It isn't about limiting. It is about tracking.

Oh, yes. I missed it.
>
[..]

> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index e67dc510fa1b..5376216e4f3d 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3535,7 +3535,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
>  
>  #ifdef CONFIG_SMP
>  static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1);
> -static bool printk_cpulock_nested;
> +static atomic_t printk_cpulock_nested = ATOMIC_INIT(0);
>  
>  /**
>   * __printk_wait_on_cpu_lock() - Busy wait until the printk cpu-reentrant
> @@ -3596,7 +3598,7 @@ int __printk_cpu_trylock(void)
>  
>  	} else if (old == cpu) {
>  		/* This CPU is already the owner. */
> -		printk_cpulock_nested = true;
> +		atomic_inc(&printk_cpulock_nested);
>  		return 1;
>  	}
>  
> @@ -3613,8 +3615,8 @@ EXPORT_SYMBOL(__printk_cpu_trylock);
>   */
>  void __printk_cpu_unlock(void)
>  {
> -	if (printk_cpulock_nested) {
> -		printk_cpulock_nested = false;
> +	if (atomic_read(&printk_cpulock_nested)) {
> +		atomic_dec(&printk_cpulock_nested);
>  		return;
>  	}

Looks good.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH next v3 1/2] dump_stack: move cpu lock to printk.c
  2021-06-16 11:21         ` Petr Mladek
@ 2021-06-16 13:40           ` John Ogness
  0 siblings, 0 replies; 10+ messages in thread
From: John Ogness @ 2021-06-16 13:40 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Stephen Rothwell, Andrew Morton,
	Daniel Bristot de Oliveira, Stephen Boyd, Alexander Potapenko,
	Peter Zijlstra

On 2021-06-16, Petr Mladek <pmladek@suse.com> wrote:
>> With this series version I moved the tracking into a global variable
>> @printk_cpulock_nested, which is fine, except that a boolean is not
>> capable of tracking more than 1 nesting. Which means that
>> __printk_cpu_unlock() would release cpu lock ownership too soon.
>> 
>> Doing this correctly is a simple change:
>> 
>> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> index e67dc510fa1b..5376216e4f3d 100644
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -3535,7 +3535,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
>>  
>>  #ifdef CONFIG_SMP
>>  static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1);
>> -static bool printk_cpulock_nested;
>> +static atomic_t printk_cpulock_nested = ATOMIC_INIT(0);
>>  
>>  /**
>>   * __printk_wait_on_cpu_lock() - Busy wait until the printk cpu-reentrant
>> @@ -3596,7 +3598,7 @@ int __printk_cpu_trylock(void)
>>  
>>  	} else if (old == cpu) {
>>  		/* This CPU is already the owner. */
>> -		printk_cpulock_nested = true;
>> +		atomic_inc(&printk_cpulock_nested);
>>  		return 1;
>>  	}
>>  
>> @@ -3613,8 +3615,8 @@ EXPORT_SYMBOL(__printk_cpu_trylock);
>>   */
>>  void __printk_cpu_unlock(void)
>>  {
>> -	if (printk_cpulock_nested) {
>> -		printk_cpulock_nested = false;
>> +	if (atomic_read(&printk_cpulock_nested)) {
>> +		atomic_dec(&printk_cpulock_nested);
>
> I think about handling printk_cpulock_nested with only one
> atomic operation. Something like:
>
> 	if (atomic_dec_return(&printk_cpulock_level) == 0)
> 		atomic_set_release(&printk_cpulock_owner, -1);
>
> It would require always incremanting the number in lock, e.g.
>
> 	old = atomic_cmpxchg(&printk_cpulock_owner, -1, cpu);
> 	if (old == -1 || old == cpu) {
> 		atomic_inc(&printk_cpulock_level);
> 		return 1;
> 	}

I actually implemented similar code during an internal draft. I later
decided against it, mainly because I prefer to keep the old==-1 and
old==cpu cases separate.

Also note that atomic_dec_return() introduces an unnecessary memory
barrier. If we take your proposed implementation we would use
atomic_dec_return_relaxed() instead.

> But I am not sure if it is really better. Feel free to keep
> your variant.

*sigh* Frankly, I don't care much. My variant saves a few CPU
instructions for the normal case (non-nested), but that probably is not
much of an argument.

For v4 I will keep my variant because it explicitly handles the
non-nested/nested cases separately, which helps when adding the memory
barrier comments in the follow-up patch. In particular, the label
LMM(__printk_cpu_trylock:B), which represents the first moment a new CPU
begins to load/store data, only applies to the old==-1 condition.

John Ogness

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-06-16 13:40 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-15 17:49 [PATCH next v3 0/2] introduce printk cpu lock John Ogness
2021-06-15 17:49 ` [PATCH next v3 1/2] dump_stack: move cpu lock to printk.c John Ogness
2021-06-15 21:33   ` John Ogness
2021-06-16  7:06     ` Sergey Senozhatsky
2021-06-16  7:29       ` John Ogness
2021-06-16 11:21         ` Petr Mladek
2021-06-16 13:40           ` John Ogness
2021-06-16 11:55         ` Sergey Senozhatsky
2021-06-15 17:49 ` [PATCH next v3 2/2] printk: fix cpu lock ordering John Ogness
2021-06-16 11:30   ` Petr Mladek

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.