linux-pci.vger.kernel.org archive mirror
From: Keith Busch <kbusch@kernel.org>
To: Hinko Kocevar <hinko.kocevar@ess.eu>
Cc: "Kelley, Sean V" <sean.v.kelley@intel.com>,
	Linux PCI <linux-pci@vger.kernel.org>,
	Bjorn Helgaas <helgaas@kernel.org>
Subject: Re: [PATCHv2 0/5] aer handling fixups
Date: Mon, 11 Jan 2021 08:37:08 -0800
Message-ID: <20210111163708.GA1458209@dhcp-10-100-145-180.wdc.com>
In-Reply-To: <c3117c51-144f-ae59-ad68-bdc5532d12cb@ess.eu>

On Mon, Jan 11, 2021 at 02:39:20PM +0100, Hinko Kocevar wrote:
> Testing this patch a bit more (without the 5/5) resulted in the same CPU
> lockup:
> 
> watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [irq/122-aerdrv:128]
> 
> as I initially reported with the 5/5 of this patch included.
> 
> It seems more infrequent, though. For example, after reboot this is not
> observed and the recovery process is successful, whereas when 5/5 is also
> used every recovery resulted in CPU lockup.

I am assuming this soft lockup still occurs while restoring the
downstream port's virtual channel capability. Your initial sighting
indicates that it doesn't appear to be a deadlock, but the stack trace
never exited pci_restore_vc_state() either. I did not find any obvious
issues here from code inspection alone, so if you could try applying the
following patch and send the kernel message output, that would help.

---
diff --git a/drivers/pci/vc.c b/drivers/pci/vc.c
index 5fc59ac31145..4834af7eb582 100644
--- a/drivers/pci/vc.c
+++ b/drivers/pci/vc.c
@@ -28,6 +28,7 @@ static void pci_vc_save_restore_dwords(struct pci_dev *dev, int pos,
 {
 	int i;
 
+	pci_warn(dev, "%s: pos:%d dwords:%d\n", __func__, pos, dwords);
 	for (i = 0; i < dwords; i++, buf++) {
 		if (save)
 			pci_read_config_dword(dev, pos + (i * 4), buf);
@@ -110,6 +111,8 @@ static void pci_vc_enable(struct pci_dev *dev, int pos, int res)
 	if (!pci_is_pcie(dev) || !pcie_downstream_port(dev))
 		return;
 
+	pci_warn(dev, "%s: pos:%d res:%d\n", __func__, pos, res);
+
 	ctrl_pos = pos + PCI_VC_RES_CTRL + (res * PCI_CAP_VC_PER_VC_SIZEOF);
 	status_pos = pos + PCI_VC_RES_STATUS + (res * PCI_CAP_VC_PER_VC_SIZEOF);
 
@@ -165,6 +168,8 @@ static void pci_vc_enable(struct pci_dev *dev, int pos, int res)
 	if (link && !pci_wait_for_pending(link, status_pos2,
 					  PCI_VC_RES_STATUS_NEGO))
 		pci_err(link, "VC%d negotiation stuck pending\n", id);
+
+	pci_warn(dev, "%s: pos:%d res:%d return\n", __func__, pos, res);
 }
 
 /**
@@ -190,6 +195,7 @@ static int pci_vc_do_save_buffer(struct pci_dev *dev, int pos,
 	int i, len = 0;
 	u8 *buf = save_state ? (u8 *)save_state->cap.data : NULL;
 
+	pci_warn(dev, "%s: buf:%d pos:%d\n", __func__, buf != NULL, pos);
 	/* Sanity check buffer size for save/restore */
 	if (buf && save_state->cap.size !=
 	    pci_vc_do_save_buffer(dev, pos, NULL, save)) {
@@ -278,6 +284,8 @@ static int pci_vc_do_save_buffer(struct pci_dev *dev, int pos,
 		pci_read_config_dword(dev, pos + PCI_VC_RES_CAP +
 				      (i * PCI_CAP_VC_PER_VC_SIZEOF), &cap);
 		parb_offset = ((cap & PCI_VC_RES_CAP_ARB_OFF) >> 24) * 16;
+		pci_warn(dev, "%s: i:%d evcc:%d parb_offset:%d\n", __func__, i,
+			 evcc, parb_offset);
 		if (parb_offset) {
 			int size, parb_phases = 0;
 
@@ -332,6 +340,7 @@ static int pci_vc_do_save_buffer(struct pci_dev *dev, int pos,
 		len += 4;
 	}
 
+	pci_warn(dev, "%s: len:%d\n", __func__, len);
 	return buf ? 0 : len;
 }
 
@@ -399,6 +408,7 @@ void pci_restore_vc_state(struct pci_dev *dev)
 		if (!save_state || !pos)
 			continue;
 
+		pci_warn(dev, "%s: i:%d pos:%d\n", __func__, i, pos);
 		pci_vc_do_save_buffer(dev, pos, save_state, false);
 	}
 }
--
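
For anyone reading the resulting log, the pos, res and parb_offset
values printed by the patch come from simple offset arithmetic on the VC
capability. The following is a minimal userspace sketch of that
arithmetic, not kernel code. The constants are copied from
include/uapi/linux/pci_regs.h as an assumption and renamed with a VC_
prefix; the helper names are hypothetical.

```c
#include <stdint.h>

/*
 * Assumed to match include/uapi/linux/pci_regs.h (PCI_VC_RES_CAP,
 * PCI_VC_RES_CTRL, PCI_VC_RES_STATUS, PCI_VC_RES_CAP_ARB_OFF and
 * PCI_CAP_VC_PER_VC_SIZEOF). Verify against your tree.
 */
#define VC_RES_CAP          0x10
#define VC_RES_CTRL         0x14
#define VC_RES_STATUS       0x1A
#define VC_RES_CAP_ARB_OFF  0xff000000u
#define VC_PER_VC_SIZEOF    0x0C

/* Config-space offset of the resource control register for VC 'res'. */
static int vc_res_ctrl_pos(int pos, int res)
{
	return pos + VC_RES_CTRL + (res * VC_PER_VC_SIZEOF);
}

/* Config-space offset of the resource status register for VC 'res'. */
static int vc_res_status_pos(int pos, int res)
{
	return pos + VC_RES_STATUS + (res * VC_PER_VC_SIZEOF);
}

/*
 * Port arbitration table offset in bytes, relative to the start of the
 * VC capability. The register field counts 16-byte units, and zero
 * means the resource has no port arbitration table.
 */
static int vc_parb_offset(uint32_t cap)
{
	return (int)((cap & VC_RES_CAP_ARB_OFF) >> 24) * 16;
}
```

With a capability at pos 0x100, vc_res_ctrl_pos(0x100, 1) gives 0x120,
and a resource capability value of 0x02000000 gives a parb_offset of 32.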
