* [PATCH 0/2] panic/printk/x86: Prevent some more printk-related deadlocks in panic()
@ 2019-07-16  7:28 Petr Mladek
  2019-07-16  7:28 ` [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe Petr Mladek
  ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Petr Mladek @ 2019-07-16  7:28 UTC (permalink / raw)
To: Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, Andrew Morton
Cc: Thomas Gleixner, John Ogness, Petr Tesarik, Konstantin Khlebnikov, x86, linux-kernel, Petr Mladek

Hi,

I have found some spare duct tape and wrapped some more printk-related
deadlocks in panic().

More seriously, someone reported a deadlock in panic(). Some non-trivial
debugging pointed out a problem with the following combination:

  + x86_64 architecture
  + panic()
  + pstore configured as message dumper (kmsg_dump())
  + crash kernel configured
  + crash_kexec_post_notifiers

In this case, CPUs are stopped by crash_smp_send_stop(). It uses NMI but
it does not modify cpu_online_mask. Therefore logbuf_lock might stay
locked; see the 2nd patch for more details.

The above is a real corner case. But a similar problem seems to exist
even in common situations on architectures that do not use NMI as a
fallback in smp_send_stop(); see the 1st patch.

Back to the duct tape. I hope that we will get rid of these problems
with the lockless printk ringbuffer rather sooner than later. But it
still might take some time. And the two fixes might be useful also for
stable kernels.

Petr Mladek (2):
  printk/panic: Access the main printk log in panic() only when safe
  printk/panic/x86: Allow to access printk log buffer after crash_smp_send_stop()

 arch/x86/kernel/crash.c     |  6 +++++-
 include/linux/printk.h      |  6 ++++++
 kernel/panic.c              | 49 +++++++++++++++++++++++++++------------------
 kernel/printk/printk_safe.c | 37 ++++++++++++++++++++++------------
 4 files changed, 65 insertions(+), 33 deletions(-)

--
2.16.4

^ permalink raw reply	[flat|nested] 17+ messages in thread
* [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe
  2019-07-16  7:28 [PATCH 0/2] panic/printk/x86: Prevent some more printk-related deadlocks in panic() Petr Mladek
@ 2019-07-16  7:28 ` Petr Mladek
  2019-07-17  9:56   ` Sergey Senozhatsky
  2019-07-18 10:11   ` Sergey Senozhatsky
  2019-07-16  7:28 ` [PATCH 2/2] printk/panic/x86: Allow to access printk log buffer after crash_smp_send_stop() Petr Mladek
  2019-07-18  9:59 ` [PATCH 0/2] panic/printk/x86: Prevent some more printk-related deadlocks in panic() Konstantin Khlebnikov
From: Petr Mladek @ 2019-07-16  7:28 UTC
To: Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, Andrew Morton
Cc: Thomas Gleixner, John Ogness, Petr Tesarik, Konstantin Khlebnikov, x86, linux-kernel, Petr Mladek

The kernel tries hard to store and show printk messages when panicking.
Even logbuf_lock gets re-initialized when only one CPU is running after
smp_send_stop().

Unfortunately, smp_send_stop() might fail on architectures that do not
use NMI as a fallback. Then the printk log buffer might stay locked and
a deadlock is almost inevitable.

Now, printk_safe_flush_on_panic() is safe because it accesses the main
log buffer only when the lock is available. But kmsg_dump() and
console_unlock() callers might still cause a deadlock.

A better approach is to move the printk lock busting into a separate
function, and to call all dependent operations only when the lock is
not blocked.
Signed-off-by: Petr Mladek <pmladek@suse.com> --- include/linux/printk.h | 6 ++++++ kernel/panic.c | 49 +++++++++++++++++++++++++++------------------ kernel/printk/printk_safe.c | 35 ++++++++++++++++++++------------ 3 files changed, 58 insertions(+), 32 deletions(-) diff --git a/include/linux/printk.h b/include/linux/printk.h index cefd374c47b1..4d15a0eda9c6 100644 --- a/include/linux/printk.h +++ b/include/linux/printk.h @@ -202,6 +202,7 @@ __printf(1, 2) void dump_stack_set_arch_desc(const char *fmt, ...); void dump_stack_print_info(const char *log_lvl); void show_regs_print_info(const char *log_lvl); extern asmlinkage void dump_stack(void) __cold; +extern int printk_bust_lock_safe(void); extern void printk_safe_init(void); extern void printk_safe_flush(void); extern void printk_safe_flush_on_panic(void); @@ -269,6 +270,11 @@ static inline void dump_stack(void) { } +static inline int printk_bust_lock_safe(void) +{ + return 0; +} + static inline void printk_safe_init(void) { } diff --git a/kernel/panic.c b/kernel/panic.c index 4d9f55bf7d38..aa50cdb75022 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -155,6 +155,26 @@ static void panic_print_sys_info(void) ftrace_dump(DUMP_ALL); } +/* Try hard to make the messages visible on the console. */ +void panic_console_dump(void) +{ +#ifdef CONFIG_VT + unblank_screen(); +#endif + console_unblank(); + + /* + * We may have ended up stopping the CPU holding the lock (in + * smp_send_stop()) while still having some valuable data in the console + * buffer. Try to acquire the lock then release it regardless of the + * result. The release will also print the buffers out. Locks debug + * should be disabled to avoid reporting bad unlock balance when + * panic() is not being callled from OOPS. + */ + debug_locks_off(); + console_flush_on_panic(CONSOLE_FLUSH_PENDING); +} + /** * panic - halt the system * @fmt: The text string to print @@ -169,6 +189,7 @@ void panic(const char *fmt, ...) 
va_list args; long i, i_next = 0, len; int state = 0; + int printk_blocked; int old_cpu, this_cpu; bool _crash_kexec_post_notifiers = crash_kexec_post_notifiers; @@ -253,8 +274,11 @@ void panic(const char *fmt, ...) atomic_notifier_call_chain(&panic_notifier_list, 0, buf); /* Call flush even twice. It tries harder with a single online CPU */ - printk_safe_flush_on_panic(); - kmsg_dump(KMSG_DUMP_PANIC); + printk_blocked = printk_bust_lock_safe(); + if (!printk_blocked) { + printk_safe_flush_on_panic(); + kmsg_dump(KMSG_DUMP_PANIC); + } /* * If you doubt kdump always works fine in any situation, @@ -268,23 +292,10 @@ void panic(const char *fmt, ...) if (_crash_kexec_post_notifiers) __crash_kexec(NULL); -#ifdef CONFIG_VT - unblank_screen(); -#endif - console_unblank(); - - /* - * We may have ended up stopping the CPU holding the lock (in - * smp_send_stop()) while still having some valuable data in the console - * buffer. Try to acquire the lock then release it regardless of the - * result. The release will also print the buffers out. Locks debug - * should be disabled to avoid reporting bad unlock balance when - * panic() is not being callled from OOPS. - */ - debug_locks_off(); - console_flush_on_panic(CONSOLE_FLUSH_PENDING); - - panic_print_sys_info(); + if (!printk_blocked) { + panic_console_dump(); + panic_print_sys_info(); + } if (!panic_blink) panic_blink = no_blink; diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c index b4045e782743..71d4b763f811 100644 --- a/kernel/printk/printk_safe.c +++ b/kernel/printk/printk_safe.c @@ -253,6 +253,26 @@ void printk_safe_flush(void) } } +/** + * printk_try_bust_lock - make printk log accessible when safe + * + * Return 0 when the log is accessible. Return -EWOULDBLOCK when + * it is not safe and likely to cause a deadlock. 
+ */ +int printk_bust_lock_safe(void) +{ + if (!raw_spin_is_locked(&logbuf_lock)) + return 0; + + if (num_online_cpus() == 1) { + debug_locks_off(); + raw_spin_lock_init(&logbuf_lock); + return 0; + } + + return -EWOULDBLOCK; +} + /** * printk_safe_flush_on_panic - flush all per-cpu nmi buffers when the system * goes down. @@ -265,19 +285,8 @@ void printk_safe_flush(void) */ void printk_safe_flush_on_panic(void) { - /* - * Make sure that we could access the main ring buffer. - * Do not risk a double release when more CPUs are up. - */ - if (raw_spin_is_locked(&logbuf_lock)) { - if (num_online_cpus() > 1) - return; - - debug_locks_off(); - raw_spin_lock_init(&logbuf_lock); - } - - printk_safe_flush(); + if (printk_bust_lock_safe() == 0) + printk_safe_flush(); } #ifdef CONFIG_PRINTK_NMI -- 2.16.4 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe
  2019-07-16  7:28 ` [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe Petr Mladek
@ 2019-07-17  9:56   ` Sergey Senozhatsky
  2019-07-18  8:36     ` Petr Mladek
From: Sergey Senozhatsky @ 2019-07-17  9:56 UTC
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, Andrew Morton, Thomas Gleixner, John Ogness, Petr Tesarik, Konstantin Khlebnikov, x86, linux-kernel

On (07/16/19 09:28), Petr Mladek wrote:
> Kernel tries hard to store and show printk messages when panicking. Even
> logbuf_lock gets re-initialized when only one CPU is running after
> smp_send_stop().
>
> Unfortunately, smp_send_stop() might fail on architectures that do not
> use NMI as a fallback. Then printk log buffer might stay locked and
> a deadlock is almost inevitable.

I'd say that the deadlock is still almost inevitable.

The panic-CPU syncs with the printing-CPU before it attempts SMP_STOP.
If there is an active printing-CPU, which is looping in console_unlock(),
taking logbuf_lock in order to msg_print_text() and so on, then the
panic-CPU will spin on console_owner waiting for that printing-CPU to
hand over printing duties.

	pr_emerg("Kernel panic - not syncing");
	smp_send_stop();

If the printing-CPU goes nuts under logbuf_lock, has a corrupted IDT or
anything else, then we will not progress with panic(). The panic-CPU
will deadlock. If not on

	pr_emerg("Kernel panic - not syncing")

then on another pr_emerg(), right before the NMI fallback.

	static void native_stop_other_cpus()
	{
		...
		pr_emerg("Shutting down cpus with NMI\n");
			^^ deadlock here
		apic->send_IPI_allbutself(NMI_VECTOR);
			^^ not going to happen
		...
	}

And it's not only x86. In many cases, if we fail to SMP_STOP other CPUs
and one of them is holding logbuf_lock, then we are done with panic().
We will not return from smp_send_stop().
arm/kernel/smp.c void smp_send_stop(void) { ... if (num_online_cpus() > 1) pr_warn("SMP: failed to stop secondary CPUs\n"); } arm64/kernel/smp.c void crash_smp_send_stop(void) { ... pr_crit("SMP: stopping secondary CPUs\n"); smp_cross_call(&mask, IPI_CPU_CRASH_STOP); ... if (atomic_read(&waiting_for_crash_ipi) > 0) pr_warning("SMP: failed to stop secondary CPUs %*pbl\n", cpumask_pr_args(&mask)); ... } arm64/kernel/smp.c void smp_send_stop(void) { ... if (num_online_cpus() > 1) pr_warning("SMP: failed to stop secondary CPUs %*pbl\n", cpumask_pr_args(cpu_online_mask)); ... } riscv/kernel/smp.c void smp_send_stop(void) { ... if (num_online_cpus() > 1) pr_warn("SMP: failed to stop secondary CPUs %*pbl\n", cpumask_pr_args(cpu_online_mask)); ... } And so on. -ss ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe
  2019-07-17  9:56   ` Sergey Senozhatsky
@ 2019-07-18  8:36     ` Petr Mladek
  2019-07-18  9:49       ` Sergey Senozhatsky
From: Petr Mladek @ 2019-07-18  8:36 UTC
To: Sergey Senozhatsky
Cc: Konstantin Khlebnikov, Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, x86, John Ogness, Thomas Gleixner, Andrew Morton, Petr Tesarik, linux-kernel

On Wed 2019-07-17 18:56:15, Sergey Senozhatsky wrote:
> On (07/16/19 09:28), Petr Mladek wrote:
> > Kernel tries hard to store and show printk messages when panicking. Even
> > logbuf_lock gets re-initialized when only one CPU is running after
> > smp_send_stop().
> >
> > Unfortunately, smp_send_stop() might fail on architectures that do not
> > use NMI as a fallback. Then printk log buffer might stay locked and
> > a deadlock is almost inevitable.
>
> I'd say that deadlock is still almost inevitable.
>
> panic-CPU syncs with the printing-CPU before it attempts to SMP_STOP.
> If there is an active printing-CPU, which is looping in console_unlock(),
> taking logbuf_lock in order to msg_print_text() and stuff, then panic-CPU
> will spin on console_owner waiting for that printing-CPU to handover
> printing duties.
>
>	pr_emerg("Kernel panic - not syncing");
>	smp_send_stop();

Good point. I forgot the handover logic. Well, it is enabled only
around call_console_drivers(). Therefore it is not under logbuf_lock.

I had in mind some infinite loop or deadlock in vprintk_store().
There was at least one long time ago (a warning triggered by a leap
second).

> If printing-CPU goes nuts under logbuf_lock, has corrupted IDT or anything
> else, then we will not progress with panic(). panic-CPU will deadlock. If
> not on
>	pr_emerg("Kernel panic - not syncing")
>
> then on another pr_emerg(), right before the NMI-fallback.

Nested printk() should not be a problem thanks to printk_safe.
Also printk_safe_flush_on_panic() is safe because it checks whether
the lock is available. The problem is kmsg_dump() and console_unlock()
called from console_unblank() and console_flush_on_panic(). They do not
check whether the lock is available.

This patch does not help in all possible scenarios. But I still believe
that it will help in some.

Well, I am primarily interested in the 2nd patch. It fixes a real-life
bug report.

Best Regards,
Petr
* Re: [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe 2019-07-18 8:36 ` Petr Mladek @ 2019-07-18 9:49 ` Sergey Senozhatsky 2019-07-19 12:57 ` Petr Mladek 0 siblings, 1 reply; 17+ messages in thread From: Sergey Senozhatsky @ 2019-07-18 9:49 UTC (permalink / raw) To: Petr Mladek Cc: Sergey Senozhatsky, Konstantin Khlebnikov, Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, x86, John Ogness, Thomas Gleixner, Andrew Morton, Petr Tesarik, linux-kernel On (07/18/19 10:36), Petr Mladek wrote: > On Wed 2019-07-17 18:56:15, Sergey Senozhatsky wrote: > > On (07/16/19 09:28), Petr Mladek wrote: > > > Kernel tries hard to store and show printk messages when panicking. Even > > > logbuf_lock gets re-initialized when only one CPU is running after > > > smp_send_stop(). > > > > > > Unfortunately, smp_send_stop() might fail on architectures that do not > > > use NMI as a fallback. Then printk log buffer might stay locked and > > > a deadlock is almost inevitable. > > > > I'd say that deadlock is still almost inevitable. > > > > panic-CPU syncs with the printing-CPU before it attempts to SMP_STOP. > > If there is an active printing-CPU, which is looping in console_unlock(), > > taking logbuf_lock in order to msg_print_text() and stuff, then panic-CPU > > will spin on console_owner waiting for that printing-CPU to handover > > printing duties. > > > > pr_emerg("Kernel panic - not syncing"); > > smp_send_stop(); > > Good point. I forgot the handover logic. Well, it is enabled only > around call_console_drivers(). Therefore it is not under > lockbuf_lock. > > I had in mind some infinite loop or deadlock in vprintk_store(). > There was at least one long time ago (warning triggered > by leap second). > > > > If printing-CPU goes nuts under logbuf_lock, has corrupted IDT or anything > > else, then we will not progress with panic(). panic-CPU will deadlock. 
If > > not on > > pr_emerg("Kernel panic - not syncing") > > > > then on another pr_emerg(), right before the NMI-fallback. > > Nested printk() should not be problem thanks to printk_safe. Where do nested printk()-s come from? Which one of the following scenarios you cover in commit message: scenario 1 - we have CPUB which holds logbuf_lock - we have CPUA which panic()-s the system, but can't bring CPUB down, so logbuf_lock stays locked on remote CPU scenario 2 - we have CPUA which holds logbuf_lock - we have panic() on CPUA, but it cannot bring down some other CPUB so logbuf_lock stays locked on local CPU, and it cannot re-init logbuf. -ss ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe 2019-07-18 9:49 ` Sergey Senozhatsky @ 2019-07-19 12:57 ` Petr Mladek 2019-07-23 3:13 ` Sergey Senozhatsky 0 siblings, 1 reply; 17+ messages in thread From: Petr Mladek @ 2019-07-19 12:57 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Konstantin Khlebnikov, Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, x86, John Ogness, Thomas Gleixner, Andrew Morton, Petr Tesarik, linux-kernel On Thu 2019-07-18 18:49:34, Sergey Senozhatsky wrote: > On (07/18/19 10:36), Petr Mladek wrote: > > On Wed 2019-07-17 18:56:15, Sergey Senozhatsky wrote: > > > On (07/16/19 09:28), Petr Mladek wrote: > > > > Kernel tries hard to store and show printk messages when panicking. Even > > > > logbuf_lock gets re-initialized when only one CPU is running after > > > > smp_send_stop(). > > > > > > > > Unfortunately, smp_send_stop() might fail on architectures that do not > > > > use NMI as a fallback. Then printk log buffer might stay locked and > > > > a deadlock is almost inevitable. > > > > > > I'd say that deadlock is still almost inevitable. > > > > > > panic-CPU syncs with the printing-CPU before it attempts to SMP_STOP. > > > If there is an active printing-CPU, which is looping in console_unlock(), > > > taking logbuf_lock in order to msg_print_text() and stuff, then panic-CPU > > > will spin on console_owner waiting for that printing-CPU to handover > > > printing duties. > > > > > > pr_emerg("Kernel panic - not syncing"); > > > smp_send_stop(); > > > > Good point. I forgot the handover logic. Well, it is enabled only > > around call_console_drivers(). Therefore it is not under > > lockbuf_lock. > > > > I had in mind some infinite loop or deadlock in vprintk_store(). > > There was at least one long time ago (warning triggered > > by leap second). > > > > > > > If printing-CPU goes nuts under logbuf_lock, has corrupted IDT or anything > > > else, then we will not progress with panic(). 
panic-CPU will deadlock. If
> > > not on
> > >	pr_emerg("Kernel panic - not syncing")
> > >
> > > then on another pr_emerg(), right before the NMI-fallback.
> >
> > Nested printk() should not be a problem thanks to printk_safe.
>
> Where do nested printk()-s come from? Which one of the following
> scenarios do you cover in the commit message:
>
> scenario 1
>
> - we have CPUB which holds logbuf_lock
> - we have CPUA which panic()-s the system, but can't bring CPUB down,
>   so logbuf_lock stays locked on remote CPU

No, this scenario is not affected by this patch. It would always lead
to a deadlock.

> scenario 2
>
> - we have CPUA which holds logbuf_lock
> - we have panic() on CPUA, but it cannot bring down some other CPUB,
>   so logbuf_lock stays locked on local CPU, and it cannot re-init
>   logbuf.

This scenario should be handled better by this patch. The difference
shows when smp_send_stop() is not able to stop all CPUs:

  + Before:
    + printk_safe_flush_on_panic() will keep logbuf_lock locked
      and do nothing.
    + kmsg_dump(), console_unblank(), or console_flush_on_panic()
      will deadlock when they try to get logbuf_lock. They will
      not be able to process a single line.

  + After:
    + printk_bust_lock_safe() will keep logbuf_lock locked.
    + All functions using logbuf_lock will not get called.
      We will not see the messages (as previously) but the
      system will not deadlock.

But there is one more scenario 3:

  - we have CPUB which loops or is deadlocked in IRQ context
  - we have CPUA which panic()-s the system, but can't bring CPUB down,
    so logbuf_lock might be taken and released from time to time
    by CPUB

Hmm, this scenario might be handled a bit _worse_ by this patch:

  + Before:
    + printk_safe_flush_on_panic() will not touch logbuf_lock.
      The messages will get flushed according to the state of
      logbuf_lock at the moment when it is being checked.
    + kmsg_dump(), console_unblank(), or console_flush_on_panic()
      will be able to do their job.
  + After:
    + printk_safe_flush_on_panic(), kmsg_dump(), console_unblank(),
      and console_flush_on_panic() could finish the job. But they
      will get called _only_ when logbuf_lock happens to be released
      at the moment when it is checked by printk_bust_lock_safe().

Summary: From my POV, the 3rd scenario is the most likely one.
Therefore this patch would do more harm than good.

It might be possible to somehow detect whether logbuf_lock is released
from time to time on the non-stopped CPU. But it would be hairy. IMHO,
it is not worth it.

Thanks a lot for helping me to sort out the ideas. I suggest we forget
this patch and work on the lockless ringbuffer.

Best Regards,
Petr
* Re: [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe 2019-07-19 12:57 ` Petr Mladek @ 2019-07-23 3:13 ` Sergey Senozhatsky 2019-07-24 12:27 ` Petr Mladek 0 siblings, 1 reply; 17+ messages in thread From: Sergey Senozhatsky @ 2019-07-23 3:13 UTC (permalink / raw) To: Petr Mladek Cc: Sergey Senozhatsky, Konstantin Khlebnikov, Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, x86, John Ogness, Thomas Gleixner, Andrew Morton, Petr Tesarik, linux-kernel On (07/19/19 14:57), Petr Mladek wrote: [..] > > Where do nested printk()-s come from? Which one of the following > > scenarios you cover in commit message: > > > > scenario 1 > > > > - we have CPUB which holds logbuf_lock > > - we have CPUA which panic()-s the system, but can't bring CPUB down, > > so logbuf_lock stays locked on remote CPU > > No, this scenario is not affected by this patch. It would always lead to > a deadlock. Agreed, in many cases we won't be able to panic() the system properly, deadlocking somewhere in smp_send_stop(). > > scenario 2 > > > > - we have CPUA which holds logbuf_lock > > - we have panic() on CPUA, but it cannot bring down some other CPUB > > so logbuf_lock stays locked on local CPU, and it cannot re-init > > logbuf. [..] > + Before: > + printk_safe_flush_on_panic() will keep logbuf_lock locked > and do nothing. > > + kmsg_dump(), console_unblank(), or console_flush_on_panic() > will deadlock when they try to get logbuf_lock(). They will > not be able to process any single line. > > + After: > + printk_bust_lock_safe() will keep logbuf_lock locked > > + All functions using logbuf_lock will not get called. > We will not see the messages (as previously) but the > system will not deadlock. > > > But there is one more scenario 3: Yes! > - we have CPUB which loops or is deadlocked in IRQ context > > - we have CPUA which panic()-s the system, but can't bring CPUB down, > so logbuf_lock might be takes and release from time to time > by CPUB Great! 
This is the only case where we actually need to pay attention to
num_online_cpus(), because there is an active logbuf_lock owner; in
any other case we can unconditionally re-init the printk() locks.

But there is more to it.

Note that the problem in scenario 3 is bigger than just logbuf_lock.
Regardless of the logbuf implementation, we will not be able to panic()
the system.

If we have a never-ending source of printk() messages, coming from a
misbehaving CPU which is stuck in a printing loop in IRQ context, then
flush_on_panic() will never end, or the kmsg dump will never stop, etc.

We need to cut off misbehaving CPUs. The panic CPU waits (for up to 1
second?) in smp_send_stop() for secondary CPUs to die. If some
secondary CPUs are still chatty after that, then most likely those CPUs
don't have anything good to say, just a pointless flood of the same
messages over and over again; which, however, will not let the panic
CPU proceed. And this is where the idea of "disconnecting" those CPUs
from the main logbuf comes from.

So what we can do:
 - smp_send_stop()
 - disconnect all-but-self from logbuf (via printk-safe)
 - wait for 1 or 2 more extra seconds for secondary CPUs to leave
   console_unlock() and to redirect printks to per-CPU buffers
 - after that we are sort of good-to-go: re-init printk locks and
   do kmsg_dump, flush_on_panic().

Note, misbehaving CPUs will write to per-CPU buffers; they are not
expected to be able to flush per-CPU buffers to the main logbuf. That
would require enabled IRQs, which should deliver the stop IPI. But we
can do even more - just disable the printk_safe irq work on
disconnected CPUs.

So, shall we try one more time with the "disconnect" misbehaving CPUs
approach? I can send an RFC patch.

-ss
* Re: [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe 2019-07-23 3:13 ` Sergey Senozhatsky @ 2019-07-24 12:27 ` Petr Mladek 2019-07-31 6:08 ` Sergey Senozhatsky 0 siblings, 1 reply; 17+ messages in thread From: Petr Mladek @ 2019-07-24 12:27 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Konstantin Khlebnikov, Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, x86, John Ogness, Thomas Gleixner, Andrew Morton, Petr Tesarik, linux-kernel On Tue 2019-07-23 12:13:40, Sergey Senozhatsky wrote: > On (07/19/19 14:57), Petr Mladek wrote: > > But there is one more scenario 3: > > Yes! > > > - we have CPUB which loops or is deadlocked in IRQ context > > > > - we have CPUA which panic()-s the system, but can't bring CPUB down, > > so logbuf_lock might be takes and release from time to time > > by CPUB > > Great! > > This is the only case when we actually need to pay attention to > num_online_cpus(), because there is an active logbuf_lock owner; > in any other case we can unconditionally re-init printk() locks. > > But there is more to it. > > Note, that the problem in scenario 3 is bigger than just logbuf_lock. > Regardless of logbuf implementation we will not be able to panic() > the system. > > If we have a never ending source of printk() messages, coming from > misbehaving CPU which stuck in printing loop in IRQ context, then > flush_on_panic() will never end or kmsg dump will never stop, etc. Yes. > We need to cut off misbehaving CPUs. Panic CPU waits (for up to 1 > second?) in smp_send_stop() for secondary CPUs to die, if some > secondary CPUs are still chatty after that then most likely those > CPUs don't have anything good to say, just a pointless flood of same > messages over and over again; which, however, will not let panic > CPU to proceed. Makes sense. > And this is where the idea of "disconnecting" those CPUs from main > logbuf come from. 
> So what we can do:
>  - smp_send_stop()
>  - disconnect all-but-self from logbuf (via printk-safe)

printk_safe is not really necessary. As you wrote, nobody is interested
in messages from CPUs that are supposed to be stopped.

It might be enough to set some global variable, for example, with the
number of the CPU that is calling panic() and is the only one allowed
to print messages from this point on.

It might even be used to force the console lock owner to leave the loop
immediately.

>  - wait for 1 or 2 more extra seconds for secondary CPUs to leave
>    console_unlock() and to redirect printks to per-CPU buffers
>  - after that we are sort of good-to-go: re-init printk locks
>    and do kmsg_dump, flush_on_panic().

Sounds good.

> Note, misbehaving CPUs will write to per-CPU buffers, they are not
> expected to be able to flush per-CPU buffers to the main logbuf. That
> will require enabled IRQs, which should deliver stop IPI. But we can
> do even more - just disable print_safe irq work on disconnect CPUs.
>
> So, shall we try one more time with the "disconnect" misbehaving CPUs
> approach? I can send an RFC patch.

IMHO, it will be acceptable only when it is reasonably simple and
straightforward. The panic() code is full of special hacks, and it is
already complicated to follow all the twists.

Especially because this scenario came up from a theoretical discussion.
Note that my original, real bug report was with logbuf_lock, NMI,
enabled kdump notifiers, ...

Best Regards,
Petr
* Re: [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe 2019-07-24 12:27 ` Petr Mladek @ 2019-07-31 6:08 ` Sergey Senozhatsky 2019-07-31 6:59 ` Sergey Senozhatsky 0 siblings, 1 reply; 17+ messages in thread From: Sergey Senozhatsky @ 2019-07-31 6:08 UTC (permalink / raw) To: Petr Mladek Cc: Sergey Senozhatsky, Konstantin Khlebnikov, Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, x86, John Ogness, Thomas Gleixner, Andrew Morton, Petr Tesarik, linux-kernel Sorry for the delayed response. On (07/24/19 14:27), Petr Mladek wrote: [..] > > And this is where the idea of "disconnecting" those CPUs from main > > logbuf come from. > > > > So what we can do: > > - smp_send_stop() > > - disconnect all-but-self from logbuf (via printk-safe) > > printk_safe is not really necessary. As you wrote, nobody > is interested into messages from CPUs that are supposed > to be stopped. OK. Doing it in printk.c makes it easier to handle console_owner/waiter, which are not exported. > It might be enough to set some global variable, for example, > with the CPU number that is calling panic() and is the only > one allowed to print messages from this point on. Sounds good. > It might even be used to force console lock owner to leave > the cycle immediately. Yes, makes sense. [..] > > So, shall we try one more time with the "disconnect" misbehaving CPUs > > approach? I can send an RFC patch. > > IMHO, it will be acceptable only when it is reasonably simple and > straightforward. The panic() code is full of special hacks and > it is already complicated to follow all the twists. When you have a chance, mind to take a look at the patch below? Doesn't look very difficult (half of it are white-spaces and comments, I believe). > Especially because this scenario came up from a theoretical > discussion. Note that my original, real bug report, was > with logbuf_lock NMI, enabled kdump notifiers, ... I understand. 
The idea is that this patch should handle your real scenario + theoretical scenario. --- include/linux/printk.h | 2 ++ kernel/panic.c | 9 ++++++- kernel/printk/printk.c | 53 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 63 insertions(+), 1 deletion(-) diff --git a/include/linux/printk.h b/include/linux/printk.h index 57c9473f4a81..016f5ba06e94 100644 --- a/include/linux/printk.h +++ b/include/linux/printk.h @@ -322,6 +322,8 @@ static inline void printk_safe_flush_on_panic(void) } #endif +void printk_enter_panic_mode(int cpu); + extern int kptr_restrict; #ifndef pr_fmt diff --git a/kernel/panic.c b/kernel/panic.c index d1ece4c363b9..85fac975a90f 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -254,13 +254,20 @@ void panic(const char *fmt, ...) crash_smp_send_stop(); } + /* Misbehaving secondary CPUs cannot printk() to the main logbuf now */ + printk_enter_panic_mode(this_cpu); + /* * Run any panic handlers, including those that might need to * add information to the kmsg dump output. */ atomic_notifier_call_chain(&panic_notifier_list, 0, buf); - /* Call flush even twice. It tries harder with a single online CPU */ + /* + * Call flush even twice. It tries harder with a single online CPU. + * Even if we failed to stop some of secondary CPUs we have printk + * locks re-initialized and keep secondary CPUs off printk(). + */ printk_safe_flush_on_panic(); kmsg_dump(KMSG_DUMP_PANIC); diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index f0bc37a511a7..750f83c3b589 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -87,6 +87,8 @@ static DEFINE_SEMAPHORE(console_sem); struct console *console_drivers; EXPORT_SYMBOL_GPL(console_drivers); +static atomic_t __read_mostly printk_panic_cpu = ATOMIC_INIT(PANIC_CPU_INVALID); + /* * System may need to suppress printk message under certain * circumstances, like after kernel panic happens. 
@@ -1644,6 +1646,49 @@ static void console_lock_spinning_enable(void) spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_); } +static int panic_in_progress_on_other_cpu(void) +{ + int cpu = atomic_read(&printk_panic_cpu); + + return cpu != PANIC_CPU_INVALID && cpu != smp_processor_id(); +} + +void printk_enter_panic_mode(int cpu) +{ + unsigned long timeout; + + cpu = atomic_cmpxchg(&printk_panic_cpu, PANIC_CPU_INVALID, cpu); + /* printk can enter panic mode only once */ + if (cpu != PANIC_CPU_INVALID) + return; + /* + * Wait for active secondary CPUs (if there are any) to leave + * console_unlock() printing loop (for up to one second). + */ + if (num_online_cpus() > 1) { + timeout = USEC_PER_SEC; + while (num_online_cpus() > 1 && timeout--) + udelay(1); + } + + debug_locks_off(); + /* + * On some platforms crash_smp_send_stop() can kill CPUs via NMI + * vector. Re-init printk() locks just in case if any of those killed + * CPUs held any of printk() locks. On platforms which don't support + * NMI stop, misbehaving secondary CPUs will be handled by + * panic_in_progress_on_other_cpu() test. + * + * We re-init only printk() locks here. oops_in_progress is expected + * to be set by now, so good console drivers are in lockless mode, + * bad console drivers, however, can deadlock. + */ + raw_spin_lock_init(&logbuf_lock); + sema_init(&console_sem, 1); + WRITE_ONCE(console_waiter, false); + raw_spin_lock_init(&console_owner_lock); +} + /** * console_lock_spinning_disable_and_check - mark end of code where another * thread was able to busy wait and check if there is a waiter @@ -1900,6 +1945,9 @@ int vprintk_store(int facility, int level, size_t text_len; enum log_flags lflags = 0; + if (panic_in_progress_on_other_cpu()) + return 0; + /* * The printf needs to come first; we need the syslog * prefix which might be passed-in as a parameter. 
@@ -2468,6 +2516,11 @@ void console_unlock(void) return; } + if (panic_in_progress_on_other_cpu()) { + printk_safe_exit_irqrestore(flags); + return; + } + printk_safe_exit_irqrestore(flags); if (do_cond_resched) -- 2.22.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe 2019-07-31 6:08 ` Sergey Senozhatsky @ 2019-07-31 6:59 ` Sergey Senozhatsky 0 siblings, 0 replies; 17+ messages in thread From: Sergey Senozhatsky @ 2019-07-31 6:59 UTC (permalink / raw) To: Petr Mladek Cc: Konstantin Khlebnikov, Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, x86, John Ogness, Thomas Gleixner, Andrew Morton, Petr Tesarik, linux-kernel, Sergey Senozhatsky On (07/31/19 15:08), Sergey Senozhatsky wrote: > When you have a chance, mind to take a look at the patch below? > Doesn't look very difficult (half of it are white-spaces and > comments, I believe). I'm very sorry for annoyance. Updated version: -- passes !PRINTK build -- moved WRITE_ONCE(console_waiter, false) in printk_enter_panic_mode() -- added panic_in_progress_on_other_cpu() to console_trylock_spinning() No more updates this week. Will wait for feedback. Once again, sorry! --- include/linux/printk.h | 5 ++++ kernel/panic.c | 9 +++++- kernel/printk/printk.c | 64 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 77 insertions(+), 1 deletion(-) diff --git a/include/linux/printk.h b/include/linux/printk.h index 57c9473f4a81..8293156d8243 100644 --- a/include/linux/printk.h +++ b/include/linux/printk.h @@ -245,6 +245,7 @@ extern asmlinkage void dump_stack(void) __cold; extern void printk_safe_init(void); extern void printk_safe_flush(void); extern void printk_safe_flush_on_panic(void); +extern void printk_enter_panic_mode(int cpu); #else static inline __printf(1, 0) int vprintk(const char *s, va_list args) @@ -320,6 +321,10 @@ static inline void printk_safe_flush(void) static inline void printk_safe_flush_on_panic(void) { } + +static inline void printk_enter_panic_mode(int cpu) +{ +} #endif extern int kptr_restrict; diff --git a/kernel/panic.c b/kernel/panic.c index d1ece4c363b9..85fac975a90f 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -254,13 +254,20 @@ void panic(const char *fmt, ...) 
crash_smp_send_stop(); } + /* Misbehaving secondary CPUs cannot printk() to the main logbuf now */ + printk_enter_panic_mode(this_cpu); + /* * Run any panic handlers, including those that might need to * add information to the kmsg dump output. */ atomic_notifier_call_chain(&panic_notifier_list, 0, buf); - /* Call flush even twice. It tries harder with a single online CPU */ + /* + * Call flush even twice. It tries harder with a single online CPU. + * Even if we failed to stop some of secondary CPUs we have printk + * locks re-initialized and keep secondary CPUs off printk(). + */ printk_safe_flush_on_panic(); kmsg_dump(KMSG_DUMP_PANIC); diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index f0bc37a511a7..cd51aa7d08a9 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1625,6 +1625,57 @@ static DEFINE_RAW_SPINLOCK(console_owner_lock); static struct task_struct *console_owner; static bool console_waiter; +/* + * When system is in panic() this is used to permit printk() calls only from + * panic_cpu. + */ +static atomic_t __read_mostly printk_panic_cpu = ATOMIC_INIT(PANIC_CPU_INVALID); + +static int panic_in_progress_on_other_cpu(void) +{ + int cpu = atomic_read(&printk_panic_cpu); + + return cpu != PANIC_CPU_INVALID && cpu != smp_processor_id(); +} + +void printk_enter_panic_mode(int cpu) +{ + unsigned long timeout; + + cpu = atomic_cmpxchg(&printk_panic_cpu, PANIC_CPU_INVALID, cpu); + /* printk can enter panic mode only once */ + if (cpu != PANIC_CPU_INVALID) + return; + + WRITE_ONCE(console_waiter, false); + + /* + * Wait for active secondary CPUs (if there are any) to leave + * console_unlock() printing loop (for up to one second). + */ + if (num_online_cpus() > 1) { + timeout = USEC_PER_SEC; + while (num_online_cpus() > 1 && timeout--) + udelay(1); + } + + debug_locks_off(); + /* + * On some platforms crash_smp_send_stop() can kill CPUs via NMI + * vector. 
Re-init printk() locks just in case if any of those killed + * CPUs held any of printk() locks. On platforms which don't support + * NMI stop, misbehaving secondary CPUs will be handled by + * panic_in_progress_on_other_cpu() test. + * + * We re-init only printk() locks here. oops_in_progress is expected + * to be set by now, so good console drivers are in lockless mode, + * bad console drivers, however, can deadlock. + */ + raw_spin_lock_init(&logbuf_lock); + sema_init(&console_sem, 1); + raw_spin_lock_init(&console_owner_lock); +} + /** * console_lock_spinning_enable - mark beginning of code where another * thread might safely busy wait @@ -1739,6 +1790,10 @@ static int console_trylock_spinning(void) spin_release(&console_owner_dep_map, 1, _THIS_IP_); printk_safe_exit_irqrestore(flags); + + if (panic_in_progress_on_other_cpu()) + return 0; + /* * The owner passed the console lock to us. * Since we did not spin on console lock, annotate @@ -1900,6 +1955,9 @@ int vprintk_store(int facility, int level, size_t text_len; enum log_flags lflags = 0; + if (panic_in_progress_on_other_cpu()) + return 0; + /* * The printf needs to come first; we need the syslog * prefix which might be passed-in as a parameter. @@ -2076,6 +2134,7 @@ static ssize_t msg_print_ext_body(char *buf, size_t size, char *text, size_t text_len) { return 0; } static void console_lock_spinning_enable(void) { } static int console_lock_spinning_disable_and_check(void) { return 0; } +static int panic_in_progress_on_other_cpu(void) { return 0; } static void call_console_drivers(const char *ext_text, size_t ext_len, const char *text, size_t len) {} static size_t msg_print_text(const struct printk_log *msg, bool syslog, @@ -2468,6 +2527,11 @@ void console_unlock(void) return; } + if (panic_in_progress_on_other_cpu()) { + printk_safe_exit_irqrestore(flags); + return; + } + printk_safe_exit_irqrestore(flags); if (do_cond_resched) -- 2.22.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe
  2019-07-16 7:28 ` [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe Petr Mladek
  2019-07-17 9:56 ` Sergey Senozhatsky
@ 2019-07-18 10:11 ` Sergey Senozhatsky
  1 sibling, 0 replies; 17+ messages in thread
From: Sergey Senozhatsky @ 2019-07-18 10:11 UTC (permalink / raw)
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, Andrew Morton, Thomas Gleixner, John Ogness, Petr Tesarik, Konstantin Khlebnikov, x86, linux-kernel

On (07/16/19 09:28), Petr Mladek wrote:
[..]
> +int printk_bust_lock_safe(void)
> +{
> +	if (!raw_spin_is_locked(&logbuf_lock))
> +		return 0;
> +
> +	if (num_online_cpus() == 1) {
> +		debug_locks_off();
> +		raw_spin_lock_init(&logbuf_lock);
> +		return 0;
> +	}
> +
> +	return -EWOULDBLOCK;
> +}

A side note: I'd say that we also need to clear console_owner and re-init the console_owner_lock spinlock, otherwise the panic CPU can spin forever; and I think it would also be reasonable to re-init console_sem's spinlock, yet another lock on which the panic CPU can spin forever. (Assuming that one of the remote CPUs we NMI-killed held any of those locks.)

Console drivers' locks are mostly OK, as long as drivers check oops_in_progress in their ->write() path and act accordingly, but printk() has to deal with more than one lock (logbuf_lock) before it invokes the console drivers.

-ss
* [PATCH 2/2] printk/panic/x86: Allow to access printk log buffer after crash_smp_send_stop() 2019-07-16 7:28 [PATCH 0/2] panic/printk/x86: Prevent some more printk-related deadlocks in panic() Petr Mladek 2019-07-16 7:28 ` [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe Petr Mladek @ 2019-07-16 7:28 ` Petr Mladek 2019-07-18 10:47 ` Sergey Senozhatsky 2019-07-18 9:59 ` [PATCH 0/2] panic/printk/x86: Prevent some more printk-related deadlocks in panic() Konstantin Khlebnikov 2 siblings, 1 reply; 17+ messages in thread From: Petr Mladek @ 2019-07-16 7:28 UTC (permalink / raw) To: Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, Andrew Morton Cc: Thomas Gleixner, John Ogness, Petr Tesarik, Konstantin Khlebnikov, x86, linux-kernel, Petr Mladek crash_smp_send_stop() is a special variant of smp_send_stop(). It is used when crash_kexec_post_notifiers are enabled. CPUs are stopped but cpu_online_mask is not updated so that the original information is visible in crashdump. See the commit 0ee59413c967c35a6dd ("x86/panic: replace smp_send_stop() with kdump friendly version in panic path") for more details. crash_smp_send_stop() uses NMI to stop the CPUs. Then logbuf_lock might stay locked but printk_bust_lock_safe() does not know that CPUs are stopped. Solution is to force logbuf_lock re-initialization from crash_smp_send_stop(). Note that x86 seems to be the only architecture that implements crash_smp_send_stop() and uses NMI at the same time. Other architectures could not guarantee that the CPUs were really stopped. 
Signed-off-by: Petr Mladek <pmladek@suse.com> --- arch/x86/kernel/crash.c | 6 +++++- include/linux/printk.h | 4 ++-- kernel/panic.c | 2 +- kernel/printk/printk_safe.c | 8 +++++--- 4 files changed, 13 insertions(+), 7 deletions(-) diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c index 2bf70a2fed90..9a497eb37bf7 100644 --- a/arch/x86/kernel/crash.c +++ b/arch/x86/kernel/crash.c @@ -99,7 +99,11 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs) void kdump_nmi_shootdown_cpus(void) { nmi_shootdown_cpus(kdump_nmi_callback); - + /* + * CPUs are stopped but it is not visible via cpus_online + * bitmap. Bust logbuf_lock to make kmsg_dump() working. + */ + printk_bust_lock_safe(true); disable_local_APIC(); } diff --git a/include/linux/printk.h b/include/linux/printk.h index 4d15a0eda9c6..c050f1dafc32 100644 --- a/include/linux/printk.h +++ b/include/linux/printk.h @@ -202,7 +202,7 @@ __printf(1, 2) void dump_stack_set_arch_desc(const char *fmt, ...); void dump_stack_print_info(const char *log_lvl); void show_regs_print_info(const char *log_lvl); extern asmlinkage void dump_stack(void) __cold; -extern int printk_bust_lock_safe(void); +extern int printk_bust_lock_safe(bool kdump_smp_stop); extern void printk_safe_init(void); extern void printk_safe_flush(void); extern void printk_safe_flush_on_panic(void); @@ -270,7 +270,7 @@ static inline void dump_stack(void) { } -static inline int printk_bust_lock_safe(void) +static inline int printk_bust_lock_safe(bool kdump_smp_stop) { return 0; } diff --git a/kernel/panic.c b/kernel/panic.c index aa50cdb75022..54fae99e7a7e 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -274,7 +274,7 @@ void panic(const char *fmt, ...) atomic_notifier_call_chain(&panic_notifier_list, 0, buf); /* Call flush even twice. 
It tries harder with a single online CPU */ - printk_blocked = printk_bust_lock_safe(); + printk_blocked = printk_bust_lock_safe(false); if (!printk_blocked) { printk_safe_flush_on_panic(); kmsg_dump(KMSG_DUMP_PANIC); diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c index 71d4b763f811..e26304277886 100644 --- a/kernel/printk/printk_safe.c +++ b/kernel/printk/printk_safe.c @@ -255,16 +255,18 @@ void printk_safe_flush(void) /** * printk_try_bust_lock - make printk log accessible when safe + * @kdump_smp_stop: true when called after kdump stopped CPUs via NMI + * but did not update the number of online CPUs. * * Return 0 when the log is accessible. Return -EWOULDBLOCK when * it is not safe and likely to cause a deadlock. */ -int printk_bust_lock_safe(void) +int printk_bust_lock_safe(bool kdump_smp_stop) { if (!raw_spin_is_locked(&logbuf_lock)) return 0; - if (num_online_cpus() == 1) { + if (num_online_cpus() == 1 || kdump_smp_stop) { debug_locks_off(); raw_spin_lock_init(&logbuf_lock); return 0; @@ -285,7 +287,7 @@ int printk_bust_lock_safe(void) */ void printk_safe_flush_on_panic(void) { - if (printk_bust_lock_safe() == 0) + if (printk_bust_lock_safe(false) == 0) printk_safe_flush(); } -- 2.16.4 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] printk/panic/x86: Allow to access printk log buffer after crash_smp_send_stop()
  2019-07-16 7:28 ` [PATCH 2/2] printk/panic/x86: Allow to access printk log buffer after crash_smp_send_stop() Petr Mladek
@ 2019-07-18 10:47 ` Sergey Senozhatsky
  2019-07-18 11:07 ` Konstantin Khlebnikov
  0 siblings, 1 reply; 17+ messages in thread
From: Sergey Senozhatsky @ 2019-07-18 10:47 UTC (permalink / raw)
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, Andrew Morton, Thomas Gleixner, John Ogness, Petr Tesarik, Konstantin Khlebnikov, x86, linux-kernel

On (07/16/19 09:28), Petr Mladek wrote:
[..]
> +int printk_bust_lock_safe(bool kdump_smp_stop)
> {
> 	if (!raw_spin_is_locked(&logbuf_lock))
> 		return 0;
>
> -	if (num_online_cpus() == 1) {
> +	if (num_online_cpus() == 1 || kdump_smp_stop) {
> 		debug_locks_off();
> 		raw_spin_lock_init(&logbuf_lock);
> 		return 0;

Let me test the waters. Criticize the following idea:

Can we, sort of, disconnect "supposed to be dead" CPUs from printk() so that we can unconditionally re-init printk() from the panic CPU?

We have a per-CPU printk state; the panic CPU can set, let's say, a DEAD_CPUS_TELL_NO_TALES bit on all CPUs but itself, and vprintk_func() will do nothing when that bit is set on a particular CPU. Foreign CPUs are not even supposed to be alive, and smp_send_stop() waits long enough on average for IPI acks from secondary CPUs (need to check that), so if one of the CPUs is misbehaving and doesn't want to die (geez...) we just "disconnect" it from printk() to minimize possible logbuf/console-driver interventions and then proceed with the panic; assuming that the misbehaving CPUs are actually up to something sane. Sometimes, you know, those CPUs are already dead: accidentally powered off, or gone completely nuts and doing nothing, etc., but we can still kdump() and console_flush_on_panic().

-ss
* Re: [PATCH 2/2] printk/panic/x86: Allow to access printk log buffer after crash_smp_send_stop() 2019-07-18 10:47 ` Sergey Senozhatsky @ 2019-07-18 11:07 ` Konstantin Khlebnikov 2019-07-18 11:29 ` Sergey Senozhatsky 0 siblings, 1 reply; 17+ messages in thread From: Konstantin Khlebnikov @ 2019-07-18 11:07 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Petr Mladek, Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, Andrew Morton, Thomas Gleixner, John Ogness, Petr Tesarik, x86, Linux Kernel Mailing List On Thu, Jul 18, 2019 at 1:48 PM Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote: > > On (07/16/19 09:28), Petr Mladek wrote: > [..] > > +int printk_bust_lock_safe(bool kdump_smp_stop) > > { > > if (!raw_spin_is_locked(&logbuf_lock)) > > return 0; > > > > - if (num_online_cpus() == 1) { > > + if (num_online_cpus() == 1 || kdump_smp_stop) { > > debug_locks_off(); > > raw_spin_lock_init(&logbuf_lock); > > return 0; > > Let me test the waters. Criticize the following idea: > > Can we, sort of, disconnect "supposed to be dead" CPUs from printk() > so then we can unconditionally re-init printk() from panic-CPU? > > We have per-CPU printk_state; so panic-CPU can set, let's say, > DEAD_CPUS_TELL_NO_TALES bit on all CPUs but self, and vprintk_func() > will do nothing if DEAD_CPUS_TELL_NO_TALES bit set on particular > CPU. Foreign CPUs are not even supposed to be alive, and smp_send_stop() > waits for IPI acks from secondary CPUs long enough on average (need > to check that) so if one of the CPUs is misbehaving and doesn't want > to die (geez...) we will just "disconnect" it from printk() to minimize > possible logbuf/console drivers interventions and then proceed with > panic; assuming that misbehaving CPUs are actually up to something > sane. Sometimes, you know, in some cases, those CPUs are already dead: > either accidentally powered off, or went completely nuts and do nothing, > etc. etc. but we still can kdump() and console_flush_on_panic(). Good idea. 
The panic CPU could just increment the per-CPU state to reroute printk() into the 'safe' per-CPU buffer.
* Re: [PATCH 2/2] printk/panic/x86: Allow to access printk log buffer after crash_smp_send_stop() 2019-07-18 11:07 ` Konstantin Khlebnikov @ 2019-07-18 11:29 ` Sergey Senozhatsky 2019-07-19 12:19 ` Petr Mladek 0 siblings, 1 reply; 17+ messages in thread From: Sergey Senozhatsky @ 2019-07-18 11:29 UTC (permalink / raw) To: Konstantin Khlebnikov Cc: Sergey Senozhatsky, Petr Mladek, Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, Andrew Morton, Thomas Gleixner, John Ogness, Petr Tesarik, x86, Linux Kernel Mailing List On (07/18/19 14:07), Konstantin Khlebnikov wrote: > > Let me test the waters. Criticize the following idea: > > > > Can we, sort of, disconnect "supposed to be dead" CPUs from printk() > > so then we can unconditionally re-init printk() from panic-CPU? > > > > We have per-CPU printk_state; so panic-CPU can set, let's say, > > DEAD_CPUS_TELL_NO_TALES bit on all CPUs but self, and vprintk_func() > > will do nothing if DEAD_CPUS_TELL_NO_TALES bit set on particular > > CPU. Foreign CPUs are not even supposed to be alive, and smp_send_stop() > > waits for IPI acks from secondary CPUs long enough on average (need > > to check that) so if one of the CPUs is misbehaving and doesn't want > > to die (geez...) we will just "disconnect" it from printk() to minimize > > possible logbuf/console drivers interventions and then proceed with > > panic; assuming that misbehaving CPUs are actually up to something > > sane. Sometimes, you know, in some cases, those CPUs are already dead: > > either accidentally powered off, or went completely nuts and do nothing, > > etc. etc. but we still can kdump() and console_flush_on_panic(). > > Good idea. > Panic-CPU could just increment state to reroute printk into 'safe' > per-cpu buffer. Yeah, that's better. So we can do something like this @@ -269,15 +269,21 @@ void printk_safe_flush_on_panic(void) * Make sure that we could access the main ring buffer. * Do not risk a double release when more CPUs are up. 
*/ - if (raw_spin_is_locked(&logbuf_lock)) { - if (num_online_cpus() > 1) - return; + debug_locks_off(); + raw_spin_lock_init(&logbuf_lock); + /* + re-init the rest of printk() locks */ + printk_safe_flush(); +} [..] +void printk_switch_to_panic_mode(int panic_cpu) +{ + int cpu; + for_each_possible_cpu(cpu) { + if (cpu == panic_cpu) + continue; + per_cpu(printk_context, cpu) = 42; + } } And call printk_switch_to_panic_mode() from panic(). And we don't need to touch arch code (it also covers the case when some new ARCH will gain NMI support). -ss ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] printk/panic/x86: Allow to access printk log buffer after crash_smp_send_stop() 2019-07-18 11:29 ` Sergey Senozhatsky @ 2019-07-19 12:19 ` Petr Mladek 0 siblings, 0 replies; 17+ messages in thread From: Petr Mladek @ 2019-07-19 12:19 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Konstantin Khlebnikov, Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, x86, John Ogness, Thomas Gleixner, Andrew Morton, Petr Tesarik, Linux Kernel Mailing List On Thu 2019-07-18 20:29:54, Sergey Senozhatsky wrote: > On (07/18/19 14:07), Konstantin Khlebnikov wrote: > > > Let me test the waters. Criticize the following idea: > > > > > > Can we, sort of, disconnect "supposed to be dead" CPUs from printk() > > > so then we can unconditionally re-init printk() from panic-CPU? > > > > > > We have per-CPU printk_state; so panic-CPU can set, let's say, > > > DEAD_CPUS_TELL_NO_TALES bit on all CPUs but self, and vprintk_func() > > > will do nothing if DEAD_CPUS_TELL_NO_TALES bit set on particular > > > CPU. Foreign CPUs are not even supposed to be alive, and smp_send_stop() > > > waits for IPI acks from secondary CPUs long enough on average (need > > > to check that) so if one of the CPUs is misbehaving and doesn't want > > > to die (geez...) we will just "disconnect" it from printk() to minimize > > > possible logbuf/console drivers interventions and then proceed with > > > panic; assuming that misbehaving CPUs are actually up to something > > > sane. Sometimes, you know, in some cases, those CPUs are already dead: > > > either accidentally powered off, or went completely nuts and do nothing, > > > etc. etc. but we still can kdump() and console_flush_on_panic(). > > > > Good idea. > > Panic-CPU could just increment state to reroute printk into 'safe' > > per-cpu buffer. > > Yeah, that's better. > > So we can do something like this > > @@ -269,15 +269,21 @@ void printk_safe_flush_on_panic(void) > * Make sure that we could access the main ring buffer. 
> * Do not risk a double release when more CPUs are up.
> */
> -	if (raw_spin_is_locked(&logbuf_lock)) {
> -		if (num_online_cpus() > 1)
> -			return;
> +	debug_locks_off();
> +	raw_spin_lock_init(&logbuf_lock);

Hmm, the check for num_online_cpus() is there to avoid a deadlock caused by a double unlock. The unconditional init would bring it back.

It is about what compromise we use. We either try to get the messages out, init the lock, and risk a deadlock. Or we do not risk the deadlock.

The current code is not consistent. printk_safe_flush_on_panic() is conservative and does not risk the deadlock, while kmsg_dump(), console_unblank() and console_flush_on_panic() risk the deadlock.

The 1st patch tries to make it more consistent. It makes all the above functions as conservative as printk_safe_flush_on_panic() regarding logbuf_lock, while they still risk a deadlock with console-specific locks. The reasoning might be:

+ The code under logbuf_lock mostly just manipulates strings. There are no nested locks. Infinite loops would most likely happen also during normal operation. In other words, it is not easy to stay stuck on logbuf_lock (he, he, famous last words).

+ On the other hand, a lot of code runs with interrupts disabled. It is easier to imagine that a CPU could not get stopped by a normal interrupt because of some deadlock or infinite loop in interrupt context. It might even be code calling console_unlock(). Re-initializing logbuf_lock might be an unnecessary risk.

+ We are probably more relaxed about console-specific locks because most of them are ignored in panic after bust_spinlocks().

Anyway, this patchset is primarily about logbuf_lock-related deadlocks. I needed a fix for our customers. I do not want to spend too much time on upstreaming. IMHO, it is better to invest in the lockless ring buffer.

If we could agree that the patches make things better (more consistent, safer in kdump with notifiers enabled) and the complexity is acceptable.
Then let's accept them (with some trivial improvements). If they are too controversial, risky, or complex, then let's abandon them and concentrate on the lockless ring buffer.

Best Regards,
Petr
* Re: [PATCH 0/2] panic/printk/x86: Prevent some more printk-related deadlocks in panic() 2019-07-16 7:28 [PATCH 0/2] panic/printk/x86: Prevent some more printk-related deadlocks in panic() Petr Mladek 2019-07-16 7:28 ` [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe Petr Mladek 2019-07-16 7:28 ` [PATCH 2/2] printk/panic/x86: Allow to access printk log buffer after crash_smp_send_stop() Petr Mladek @ 2019-07-18 9:59 ` Konstantin Khlebnikov 2 siblings, 0 replies; 17+ messages in thread From: Konstantin Khlebnikov @ 2019-07-18 9:59 UTC (permalink / raw) To: Petr Mladek Cc: Sergey Senozhatsky, Steven Rostedt, Peter Zijlstra, Andrew Morton, Thomas Gleixner, John Ogness, Petr Tesarik, x86, Linux Kernel Mailing List On Tue, Jul 16, 2019 at 10:28 AM Petr Mladek <pmladek@suse.com> wrote: > > Hi, > > I have found some spare duct tape and wrapped some more printk-related > deadlocks in panic(). > > More seriously, someone reported a deadlock in panic(). Some non-trivial > debugging pointed out a problem with the following combination: > > + x86_64 architecture > + panic() > + pstore configured as message dumper (kmsg_dump()) > + crash kernel configured > + crash_kexec_post_notifiers > > In this case, CPUs are stopped by crash_smp_send_stop(). It uses > NMI but it does not modify cpu_online_mask. Therefore logbuf_lock > might stay locked, see 2nd patch for more details. > > The above is a real corner case. But similar problem seems to be > even in the common situations on architectures that do not use > NMI in smp_send_stop() as a fallback, see 1st patch. > > Back to the duct tape. I hope that we will get rid of these problems > with the lockless printk ringbuffer rather sooner than later. > But it still might take some time. And the two fixes might be > useful also for stable kernels. Looks good. 
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

> Petr Mladek (2):
>   printk/panic: Access the main printk log in panic() only when safe
>   printk/panic/x86: Allow to access printk log buffer after
>     crash_smp_send_stop()
>
>  arch/x86/kernel/crash.c     |  6 +++++-
>  include/linux/printk.h      |  6 ++++++
>  kernel/panic.c              | 49 +++++++++++++++++++++++++++------------------
>  kernel/printk/printk_safe.c | 37 ++++++++++++++++++++++------------
>  4 files changed, 65 insertions(+), 33 deletions(-)
>
> --
> 2.16.4
end of thread, other threads: [~2019-07-31 6:59 UTC | newest]

Thread overview: 17+ messages
2019-07-16  7:28 [PATCH 0/2] panic/printk/x86: Prevent some more printk-related deadlocks in panic() Petr Mladek
2019-07-16  7:28 ` [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe Petr Mladek
2019-07-17  9:56   ` Sergey Senozhatsky
2019-07-18  8:36     ` Petr Mladek
2019-07-18  9:49       ` Sergey Senozhatsky
2019-07-19 12:57         ` Petr Mladek
2019-07-23  3:13           ` Sergey Senozhatsky
2019-07-24 12:27             ` Petr Mladek
2019-07-31  6:08               ` Sergey Senozhatsky
2019-07-31  6:59                 ` Sergey Senozhatsky
2019-07-18 10:11   ` Sergey Senozhatsky
2019-07-16  7:28 ` [PATCH 2/2] printk/panic/x86: Allow to access printk log buffer after crash_smp_send_stop() Petr Mladek
2019-07-18 10:47   ` Sergey Senozhatsky
2019-07-18 11:07     ` Konstantin Khlebnikov
2019-07-18 11:29       ` Sergey Senozhatsky
2019-07-19 12:19         ` Petr Mladek
2019-07-18  9:59 ` [PATCH 0/2] panic/printk/x86: Prevent some more printk-related deadlocks in panic() Konstantin Khlebnikov