From: Mahesh Salgaonkar <mahesh@linux.ibm.com>
To: linuxppc-dev <linuxppc-dev@ozlabs.org>
Cc: Wen Xiong <wenxiong@linux.vnet.ibm.com>,
	Oliver O'Halloran <oohall@gmail.com>
Subject: [PATCH] pseries/eeh: fix the kdump kernel crash during eeh_pseries_init
Date: Mon, 20 Sep 2021 22:03:26 +0530
Message-ID: <163215558252.413351.8600189949820258982.stgit@jupiter>

On a pseries LPAR, when an empty slot is assigned to the partition or the
system is in single-LPAR mode, the kdump kernel crashes while issuing the
PHB reset. In the kdump scenario, we traverse all PHBs and issue a reset
using the pe_config_addr of the first child device present under each PHB.
However, the code assumes that no PHB slot can be empty and uses
list_first_entry() to get the first child device under the PHB. Since
list_first_entry() expects the list to be non-empty, it returns an invalid
pci_dn entry, and the code ends up dereferencing a NULL phb pointer via
pci_dn->phb, which crashes the kdump kernel.
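
For illustration only, the guard can be sketched in userspace C. This is
not the kernel code: the list helpers below are minimal stand-ins for
<linux/list.h>, and struct demo_phb, struct demo_pdn and demo_reset_phbs()
are hypothetical names invented for this sketch. It shows why an empty
child list must be checked before list_first_entry() is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Minimal userspace stand-ins for the kernel's <linux/list.h> helpers. */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
/* Like the kernel macro, this does NOT check for an empty list. */
#define list_first_entry(head, type, member) \
	container_of((head)->next, type, member)

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Hypothetical, heavily simplified stand-ins for pci_dn and the PHB. */
struct demo_pdn { int config_addr; struct list_head list; };
struct demo_phb { struct list_head child_list; };

/* Walk all PHBs and "reset" each one that has a child device.
 * Returns the number of PHBs that were not skipped. */
static int demo_reset_phbs(struct demo_phb *phbs, int n)
{
	int i, resets = 0;

	assert(phbs != NULL);
	for (i = 0; i < n; i++) {
		struct demo_pdn *pdn;

		/* On an empty list, head->next points back at the head, so
		 * list_first_entry() would yield a bogus demo_pdn pointer. */
		if (list_empty(&phbs[i].child_list))
			continue;

		pdn = list_first_entry(&phbs[i].child_list,
				       struct demo_pdn, list);
		printf("PHB %d: reset via config_addr 0x%x\n",
		       i, pdn->config_addr);
		resets++;
	}
	return resets;
}
```

With two PHBs, one empty and one holding a single device, demo_reset_phbs()
skips the empty one and touches only the populated one. The kernel also
provides list_first_entry_or_null(), which returns NULL for an empty list;
the patch below uses an explicit list_empty() check instead.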

This patch fixes the kdump kernel crash shown below by skipping empty slots:

[    0.003655] audit: initializing netlink subsys (disabled)
[    0.003765] thermal_sys: Registered thermal governor 'fair_share'
[    0.003767] thermal_sys: Registered thermal governor 'step_wise'
[    0.003783] cpuidle: using governor menu
[    0.003977] pstore: Registered nvram as persistent store backend
[    0.004590] Issue PHB reset ...
[    0.004794] audit: type=2000 audit(1631267818.000:1): state=initialized audit_enabled=0 res=1
[    2.233957] BUG: Kernel NULL pointer dereference on read at 0x00000268
[    2.233966] Faulting instruction address: 0xc000000008101fb0
[    2.233972] Oops: Kernel access of bad area, sig: 7 [#1]
[    2.233977] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[    2.233984] Modules linked in:
[    2.233989] CPU: 7 PID: 1 Comm: swapper/7 Not tainted 5.14.0 #1
[    2.233996] NIP:  c000000008101fb0 LR: c000000009284ccc CTR: c000000008029d70
[    2.234003] REGS: c00000001161b840 TRAP: 0300   Not tainted  (5.14.0)
[    2.234008] MSR:  8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 28000224  XER: 20040002
[    2.234022] CFAR: c000000008101f0c DAR: 0000000000000268 DSISR: 00080000 IRQMASK: 0
[    2.234022] GPR00: c000000009284ccc c00000001161bae0 c000000009c6d800 000000000000004d
[    2.234022] GPR04: 0000000000000004 0000000000000002 c00000001161bb4c 0000000000000000
[    2.234022] GPR08: 0000000000000000 0000000000000000 0000000000000001 c000000008e59a80
[    2.234022] GPR12: c000000008029d70 c000000009ff0400 c00000000801285c 0000000000000000
[    2.234022] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[    2.234022] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[    2.234022] GPR24: c00000000926338c c000000009248860 c0000000092f1048 c000000011079c00
[    2.234022] GPR28: c000000009785af8 c000000009d4b920 0000000000000000 0000000000000000
[    2.234091] NIP [c000000008101fb0] pseries_eeh_get_pe_config_addr+0x100/0x1b0
[    2.234100] LR [c000000009284ccc] __machine_initcall_pseries_eeh_pseries_init+0x2cc/0x350
[    2.234108] Call Trace:
[    2.234111] [c00000001161bae0] [c00000001161bb80] 0xc00000001161bb80 (unreliable)
[    2.234120] [c00000001161bb80] [c000000009284ccc] __machine_initcall_pseries_eeh_pseries_init+0x2cc/0x350
[    2.234128] [c00000001161bc00] [c000000008012210] do_one_initcall+0x60/0x2d0
[    2.234136] [c00000001161bcd0] [c000000009264990] kernel_init_freeable+0x350/0x3f8
[    2.234145] [c00000001161bda0] [c000000008012890] kernel_init+0x3c/0x17c
[    2.234151] [c00000001161be10] [c00000000800cdd4] ret_from_kernel_thread+0x5c/0x64
[    2.234159] Instruction dump:
[    2.234163] eba1ffe8 ebc1fff0 ebe1fff8 4e800020 7c0802a6 7ce33b78 39400001 7fe7fb78
[    2.234174] 38a00002 38800004 38c1006c f80100b0 <e91e0268> 79090020 79080022 4bf48edd
[    2.234187] ---[ end trace bee3ba4dca6761d3 ]---
[    2.235907]
[    3.235914] Kernel panic - not syncing: Fatal exception

Fixes: 5a090f7c363fd ("powerpc/pseries: PCIE PHB reset")
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
---
 arch/powerpc/platforms/pseries/eeh_pseries.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index bc15200852b7c..8780e7d33a0f5 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -867,6 +867,10 @@ static int __init eeh_pseries_init(void)
 	if (is_kdump_kernel() || reset_devices) {
 		pr_info("Issue PHB reset ...\n");
 		list_for_each_entry(phb, &hose_list, list_node) {
+			/* Skip the empty slot */
+			if (list_empty(&PCI_DN(phb->dn)->child_list))
+				continue;
+
 			pdn = list_first_entry(&PCI_DN(phb->dn)->child_list, struct pci_dn, list);
 			config_addr = pseries_eeh_get_pe_config_addr(pdn);
 



