All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Muchun Song <songmuchun@bytedance.com>,
	Petr Mladek <pmladek@suse.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Subject: [PATCH 4.9 13/41] printk: fix deadlock when kernel panic
Date: Fri,  5 Mar 2021 13:22:20 +0100	[thread overview]
Message-ID: <20210305120851.935800288@linuxfoundation.org> (raw)
In-Reply-To: <20210305120851.255002428@linuxfoundation.org>

From: Muchun Song <songmuchun@bytedance.com>

commit 8a8109f303e25a27f92c1d8edd67d7cbbc60a4eb upstream.

printk_safe_flush_on_panic() caused the following deadlock on our
server:

CPU0:                                         CPU1:
panic                                         rcu_dump_cpu_stacks
  kdump_nmi_shootdown_cpus                      nmi_trigger_cpumask_backtrace
    register_nmi_handler(crash_nmi_callback)      printk_safe_flush
                                                    __printk_safe_flush
                                                      raw_spin_lock_irqsave(&read_lock)
    // send NMI to other processors
    apic_send_IPI_allbutself(NMI_VECTOR)
                                                        // NMI interrupt, dead loop
                                                        crash_nmi_callback
  printk_safe_flush_on_panic
    printk_safe_flush
      __printk_safe_flush
        // deadlock
        raw_spin_lock_irqsave(&read_lock)

DEADLOCK: read_lock is taken on CPU1 and will never get released.

It happens when panic() stops a CPU by NMI while it has been in
the middle of printk_safe_flush().

Handle the lock the same way as logbuf_lock. The printk_safe buffers
are flushed only when both locks can be safely taken. It can avoid
the deadlock _in this particular case_ at expense of losing contents
of printk_safe buffers.

Note: It would actually be safe to re-init the locks when all CPUs were
      stopped by NMI. But it would require passing this information
      from arch-specific code. It is not worth the complexity.
      Especially because logbuf_lock and printk_safe buffers have been
      obsoleted by the lockless ring buffer.

Fixes: cf9b1106c81c ("printk/nmi: flush NMI messages on the system panic")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: <stable@vger.kernel.org>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20210210034823.64867-1-songmuchun@bytedance.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/printk/nmi.c |   16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

--- a/kernel/printk/nmi.c
+++ b/kernel/printk/nmi.c
@@ -52,6 +52,8 @@ struct nmi_seq_buf {
 };
 static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
 
+static DEFINE_RAW_SPINLOCK(nmi_read_lock);
+
 /*
  * Safe printk() for NMI context. It uses a per-CPU buffer to
  * store the message. NMIs are not nested, so there is always only
@@ -134,8 +136,6 @@ static void printk_nmi_flush_seq_line(st
  */
 static void __printk_nmi_flush(struct irq_work *work)
 {
-	static raw_spinlock_t read_lock =
-		__RAW_SPIN_LOCK_INITIALIZER(read_lock);
 	struct nmi_seq_buf *s = container_of(work, struct nmi_seq_buf, work);
 	unsigned long flags;
 	size_t len, size;
@@ -148,7 +148,7 @@ static void __printk_nmi_flush(struct ir
 	 * different CPUs. This is especially important when printing
 	 * a backtrace.
 	 */
-	raw_spin_lock_irqsave(&read_lock, flags);
+	raw_spin_lock_irqsave(&nmi_read_lock, flags);
 
 	i = 0;
 more:
@@ -197,7 +197,7 @@ more:
 		goto more;
 
 out:
-	raw_spin_unlock_irqrestore(&read_lock, flags);
+	raw_spin_unlock_irqrestore(&nmi_read_lock, flags);
 }
 
 /**
@@ -239,6 +239,14 @@ void printk_nmi_flush_on_panic(void)
 		raw_spin_lock_init(&logbuf_lock);
 	}
 
+	if (in_nmi() && raw_spin_is_locked(&nmi_read_lock)) {
+		if (num_online_cpus() > 1)
+			return;
+
+		debug_locks_off();
+		raw_spin_lock_init(&nmi_read_lock);
+	}
+
 	printk_nmi_flush();
 }
 



  parent reply	other threads:[~2021-03-05 12:43 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-05 12:22 [PATCH 4.9 00/41] 4.9.260-rc1 review Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 01/41] futex: Cleanup variable names for futex_top_waiter() Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 02/41] futex: Cleanup refcounting Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 03/41] futex: Pull rt_mutex_futex_unlock() out from under hb->lock Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 04/41] futex: Futex_unlock_pi() determinism Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 05/41] futex: Fix pi_state->owner serialization Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 06/41] futex: Fix more put_pi_state() vs. exit_pi_state_list() races Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 07/41] futex: Dont enable IRQs unconditionally in put_pi_state() Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 08/41] net: usb: qmi_wwan: support ZTE P685M modem Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 09/41] arm: kprobes: Allow to handle reentered kprobe on single-stepping Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 10/41] scripts: use pkg-config to locate libcrypto Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 11/41] scripts: set proper OpenSSL include dir also for sign-file Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 12/41] hugetlb: fix update_and_free_page contig page struct assumption Greg Kroah-Hartman
2021-03-05 12:22 ` Greg Kroah-Hartman [this message]
2021-03-05 12:22 ` [PATCH 4.9 14/41] arm64: Remove redundant mov from LL/SC cmpxchg Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 15/41] arm64: Avoid redundant type conversions in xchg() and cmpxchg() Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 16/41] arm64: cmpxchg: Use "K" instead of "L" for ll/sc immediate constraint Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 17/41] arm64: Use correct ll/sc atomic constraints Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 18/41] JFS: more checks for invalid superblock Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 19/41] xfs: Fix assert failure in xfs_setattr_size() Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 20/41] smackfs: restrict bytes count in smackfs write functions Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 21/41] net: fix up truesize of cloned skb in skb_prepare_for_shift() Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 22/41] mm/hugetlb.c: fix unnecessary address expansion of pmd sharing Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 23/41] staging: fwserial: Fix error handling in fwserial_create Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 24/41] x86/reboot: Add Zotac ZBOX CI327 nano PCI reboot quirk Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 25/41] vt/consolemap: do font sum unsigned Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 26/41] wlcore: Fix command execute failure 19 for wl12xx Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 27/41] pktgen: fix misuse of BUG_ON() in pktgen_thread_worker() Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 28/41] ath10k: fix wmi mgmt tx queue full due to race condition Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 29/41] x86/build: Treat R_386_PLT32 relocation as R_386_PC32 Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 30/41] Bluetooth: Fix null pointer dereference in amp_read_loc_assoc_final_data Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 31/41] staging: most: sound: add sanity check for function argument Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 32/41] media: uvcvideo: Allow entities with no pads Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 33/41] scsi: iscsi: Restrict sessions and handles to admin capabilities Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 34/41] sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 35/41] scsi: iscsi: Ensure sysfs attributes are limited to PAGE_SIZE Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 36/41] scsi: iscsi: Verify lengths on passthrough PDUs Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 37/41] Xen/gnttab: handle p2m update errors on a per-slot basis Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 38/41] xen-netback: respect gnttab_map_refs()s return value Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 39/41] zsmalloc: account the number of compacted pages correctly Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 40/41] swap: fix swapfile read/write offset Greg Kroah-Hartman
2021-03-05 12:22 ` [PATCH 4.9 41/41] media: v4l: ioctl: Fix memory leak in video_usercopy Greg Kroah-Hartman
2021-03-05 17:51 ` [PATCH 4.9 00/41] 4.9.260-rc1 review Jon Hunter
2021-03-06  5:16 ` Florian Fainelli
2021-03-06 16:29 ` Guenter Roeck
2021-03-06 16:30 ` Guenter Roeck
2021-03-07  2:26 ` Naresh Kamboju
2021-03-12 20:24 ` Florian Fainelli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210305120851.935800288@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pmladek@suse.com \
    --cc=sergey.senozhatsky@gmail.com \
    --cc=songmuchun@bytedance.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.