Linux-PCI Archive on lore.kernel.org
 help / color / Atom feed
From: Ian Kumlien <ian.kumlien@gmail.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: "Saheed O. Bolarinwa" <refactormyself@gmail.com>,
	Mike Marciniszyn <mike.marciniszyn@intel.com>,
	Dennis Dalessandro <dennis.dalessandro@intel.com>,
	Doug Ledford <dledford@redhat.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	bjorn@helgaas.com, skhan@linuxfoundation.org,
	linux-kernel-mentees@lists.linuxfoundation.org,
	linux-pci@vger.kernel.org,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	linux-rdma@vger.kernel.org,
	"Michael J. Ruhl" <michael.j.ruhl@intel.com>,
	Ashutosh Dixit <ashutosh.dixit@intel.com>,
	Puranjay Mohan <puranjay12@gmail.com>
Subject: Re: [PATCH v4 01/12] IB/hfi1: Check if pcie_capability_read_*() reads ~0
Date: Mon, 3 Aug 2020 13:46:21 +0200
Message-ID: <CAA85sZt1_dwpxXMrjHc+5wEH-1w9X4km95UvCzEXqJ_o-OS6Hw@mail.gmail.com> (raw)
In-Reply-To: <20200731135523.GA3717@bjorn-Precision-5520>


[-- Attachment #1: Type: text/plain, Size: 6809 bytes --]

On Fri, Jul 31, 2020 at 3:55 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Michael, Ashutosh, Ian, Puranjay]
>
> On Fri, Jul 31, 2020 at 01:02:29PM +0200, Saheed O. Bolarinwa wrote:
> > On failure pcie_capability_read_dword() sets it's last parameter,
> > val to 0. In this case dn and up will be 0, so aspm_hw_l1_supported()
> > will return false.
> > However, with Patch 12/12, it is possible that val is set to ~0 on
> > failure. This would introduce a bug because (x & x) == (~0 & x). So
> > with dn and up being 0x02, a true value is return when the read has
> > actually failed.
> >
> > Since, the value ~0 is invalid here,
> >
> > Reset dn and up to 0 when a value of ~0 is read into them, this
> > ensures false is returned on failure in this case.
> >
> > Suggested-by: Bjorn Helgaas <bjorn@helgaas.com>
> > Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
> > ---
> >
> >  drivers/infiniband/hw/hfi1/aspm.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/infiniband/hw/hfi1/aspm.c b/drivers/infiniband/hw/hfi1/aspm.c
> > index a3c53be4072c..9605b2145d19 100644
> > --- a/drivers/infiniband/hw/hfi1/aspm.c
> > +++ b/drivers/infiniband/hw/hfi1/aspm.c
> > @@ -33,13 +33,13 @@ static bool aspm_hw_l1_supported(struct hfi1_devdata *dd)
> >               return false;
> >
> >       pcie_capability_read_dword(dd->pcidev, PCI_EXP_LNKCAP, &dn);
> > -     dn = ASPM_L1_SUPPORTED(dn);
> > +     dn = (dn == (u32)~0) ? 0 : ASPM_L1_SUPPORTED(dn);
> >
> >       pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, &up);
> > -     up = ASPM_L1_SUPPORTED(up);
> > +     up = (up == (u32)~0) ? 0 : ASPM_L1_SUPPORTED(up);
>
> I don't want to change this.  The driver shouldn't be mucking with
> ASPM at all.  The PCI core should take care of this automatically.  If
> it doesn't, we need to fix the core.
>
> If the driver needs to disable ASPM to work around device errata or
> something, the core has an interface for that.  But the driver should
> not override the system-wide policy for managing ASPM.
>
> Ah, some archaeology finds affa48de8417 ("staging/rdma/hfi1: Add
> support for enabling/disabling PCIe ASPM"), which says:
>
>   hfi1 HW has a high PCIe ASPM L1 exit latency and also advertises an
>   acceptable latency less than actual ASPM latencies.
>
> That suggests that either there is a device defect, e.g., advertising
> incorrect ASPM latencies, or a PCI core defect, e.g., incorrectly
> enabling ASPM when the path exit latency exceeds that hfi1 can
> tolerate.
>
> Coincidentally, Ian recently debugged a problem in how the PCI core
> computes exit latencies over a path [1].
>
> Can anybody supply details about the hfi1 ASPM parameters, e.g., the
> output of "sudo lspci -vv"?  Any details about the configuration where
> the problem occurs?  Is there a switch in the path?
>
> [1] https://lore.kernel.org/r/20200727213045.2117855-1-ian.kumlien@gmail.com
>
> >       /* ASPM works on A-step but is reported as not supported */
> > -     return (!!dn || is_ax(dd)) && !!up;
> > +     return (dn || is_ax(dd)) && up;
> >  }
> >
> >  /* Set L1 entrance latency for slower entry to L1 */
> > --
> > 2.18.4
> >

My experience with pcie is very limited, but the more I look at things
the more I get worried...

Anyway, I have made some changes, could you try the attached patch and
see if it makes a difference?

Changes:
L0s and L1 should only apply to links that actually has it enabled,
don't store or increase values if they don't.
Work on L0s as well, currently it clobbers since I'm not certain about
upstream/downstream distinctions.

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index b17e5ffd31b1..0d93ae065f73 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -434,7 +434,8 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,

 static void pcie_aspm_check_latency(struct pci_dev *endpoint)
 {
-       u32 latency, l1_switch_latency = 0;
+       u32 latency, l1_max_latency = 0, l1_switch_latency = 0,
+               l0s_max_latency = 0;
        struct aspm_latency *acceptable;
        struct pcie_link_state *link;

@@ -447,15 +448,24 @@ static void pcie_aspm_check_latency(struct
pci_dev *endpoint)
        acceptable = &link->acceptable[PCI_FUNC(endpoint->devfn)];

        while (link) {
-               /* Check upstream direction L0s latency */
-               if ((link->aspm_capable & ASPM_STATE_L0S_UP) &&
-                   (link->latency_up.l0s > acceptable->l0s))
-                       link->aspm_capable &= ~ASPM_STATE_L0S_UP;
-
-               /* Check downstream direction L0s latency */
-               if ((link->aspm_capable & ASPM_STATE_L0S_DW) &&
-                   (link->latency_dw.l0s > acceptable->l0s))
-                       link->aspm_capable &= ~ASPM_STATE_L0S_DW;
+               if (link->aspm_capable & ASPM_STATE_L0S) {
+                       u32 l0s_up = 0, l0s_dw = 0;
+
+                       /* Check upstream direction L0s latency */
+                       if (link->aspm_capable & ASPM_STATE_L0S_UP)
+                               l0s_up = link->latency_up.l0s;
+
+                       /* Check downstream direction L0s latency */
+                       if (link->aspm_capable & ASPM_STATE_L0S_DW)
+                               l0s_dw = link->latency_dw.l0s;
+
+                       l0s_max_latency += max_t(u32, l0s_up, l0s_dw);
+
+                       /* If the latency exceeds, disable both */
+                       if (l0s_max_latency > acceptable->l0s)
+                               link->aspm_capable &= ~ASPM_STATE_L0S;
+               }
+
                /*
                 * Check L1 latency.
                 * Every switch on the path to root complex need 1
@@ -469,11 +479,13 @@ static void pcie_aspm_check_latency(struct
pci_dev *endpoint)
                 * L1 exit latencies advertised by a device include L1
                 * substate latencies (and hence do not do any check).
                 */
-               latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
-               if ((link->aspm_capable & ASPM_STATE_L1) &&
-                   (latency + l1_switch_latency > acceptable->l1))
-                       link->aspm_capable &= ~ASPM_STATE_L1;
-               l1_switch_latency += 1000;
+               if (link->aspm_capable & ASPM_STATE_L1) {
+                       latency = max_t(u32, link->latency_up.l1,
link->latency_dw.l1);
+                       l1_max_latency = max_t(u32, latency, l1_max_latency);
+                       if (l1_max_latency + l1_switch_latency > acceptable->l1)
+                               link->aspm_capable &= ~ASPM_STATE_L1;
+                       l1_switch_latency += 1000;
+               }

                link = link->parent;
        }

[-- Attachment #2: 0001-Use-maximum-latency-when-determining-L1-L0s-ASPM.patch --]
[-- Type: text/x-patch, Size: 3431 bytes --]

From 971735bd32d7a8cb7cd1a8d4316fc2a2e192f8e2 Mon Sep 17 00:00:00 2001
From: Ian Kumlien <ian.kumlien@gmail.com>
Date: Sun, 26 Jul 2020 16:01:15 +0200
Subject: [PATCH] Use maximum latency when determining L1/L0s ASPM

Currently we check the maximum latency of upstream and downstream
per link, not the maximum for the path

This would work if all links have the same latency, but:
endpoint -> c -> b -> a -> root  (in the order we walk the path)

If c or b has the higest latency, it will not register

Fix this by maintaining the maximum latency value for the path

Also, L0s seems to be a cumulative maximum over the path, fix this as
well

This change fixes a regression introduced (but not caused) by:
66ff14e59e8a (PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges)

Signed-off-by: Ian Kumlien <ian.kumlien@gmail.com>
---
 drivers/pci/pcie/aspm.c | 42 ++++++++++++++++++++++++++---------------
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index b17e5ffd31b1..0d93ae065f73 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -434,7 +434,8 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,
 
 static void pcie_aspm_check_latency(struct pci_dev *endpoint)
 {
-	u32 latency, l1_switch_latency = 0;
+	u32 latency, l1_max_latency = 0, l1_switch_latency = 0,
+		l0s_max_latency = 0;
 	struct aspm_latency *acceptable;
 	struct pcie_link_state *link;
 
@@ -447,15 +448,24 @@ static void pcie_aspm_check_latency(struct pci_dev *endpoint)
 	acceptable = &link->acceptable[PCI_FUNC(endpoint->devfn)];
 
 	while (link) {
-		/* Check upstream direction L0s latency */
-		if ((link->aspm_capable & ASPM_STATE_L0S_UP) &&
-		    (link->latency_up.l0s > acceptable->l0s))
-			link->aspm_capable &= ~ASPM_STATE_L0S_UP;
-
-		/* Check downstream direction L0s latency */
-		if ((link->aspm_capable & ASPM_STATE_L0S_DW) &&
-		    (link->latency_dw.l0s > acceptable->l0s))
-			link->aspm_capable &= ~ASPM_STATE_L0S_DW;
+		if (link->aspm_capable & ASPM_STATE_L0S) {
+			u32 l0s_up = 0, l0s_dw = 0;
+
+			/* Check upstream direction L0s latency */
+			if (link->aspm_capable & ASPM_STATE_L0S_UP)
+				l0s_up = link->latency_up.l0s;
+
+			/* Check downstream direction L0s latency */
+			if (link->aspm_capable & ASPM_STATE_L0S_DW)
+				l0s_dw = link->latency_dw.l0s;
+
+			l0s_max_latency += max_t(u32, l0s_up, l0s_dw);
+
+			/* If the latency exceeds, disable both */
+			if (l0s_max_latency > acceptable->l0s)
+				link->aspm_capable &= ~ASPM_STATE_L0S;
+		}
+
 		/*
 		 * Check L1 latency.
 		 * Every switch on the path to root complex need 1
@@ -469,11 +479,13 @@ static void pcie_aspm_check_latency(struct pci_dev *endpoint)
 		 * L1 exit latencies advertised by a device include L1
 		 * substate latencies (and hence do not do any check).
 		 */
-		latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
-		if ((link->aspm_capable & ASPM_STATE_L1) &&
-		    (latency + l1_switch_latency > acceptable->l1))
-			link->aspm_capable &= ~ASPM_STATE_L1;
-		l1_switch_latency += 1000;
+		if (link->aspm_capable & ASPM_STATE_L1) {
+			latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
+			l1_max_latency = max_t(u32, latency, l1_max_latency);
+			if (l1_max_latency + l1_switch_latency > acceptable->l1)
+				link->aspm_capable &= ~ASPM_STATE_L1;
+			l1_switch_latency += 1000;
+		}
 
 		link = link->parent;
 	}
-- 
2.28.0


  reply index

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-31 11:02 [PATCH v4 00/12] PCI: Remove '*val = 0' from pcie_capability_read_*() Saheed O. Bolarinwa
2020-07-31 11:02 ` [PATCH v4 01/12] IB/hfi1: Check if pcie_capability_read_*() reads ~0 Saheed O. Bolarinwa
2020-07-31 13:55   ` Bjorn Helgaas
2020-08-03 11:46     ` Ian Kumlien [this message]
2020-07-31 11:02 ` [PATCH v4 02/12] misc: rtsx: " Saheed O. Bolarinwa
2020-07-31 11:02 ` [PATCH v4 03/12] ath9k: " Saheed O. Bolarinwa
2020-07-31 11:02 ` [PATCH v4 04/12] iwlegacy: " Saheed O. Bolarinwa
2020-07-31 11:02 ` [PATCH v4 05/12] PCI: pciehp: Set "Power On" as the default get_power_status Saheed O. Bolarinwa
2020-07-31 11:02 ` [PATCH v4 06/12] PCI: pciehp: Check if pcie_capability_read_*() reads ~0 Saheed O. Bolarinwa
2020-07-31 11:02 ` [PATCH v4 07/12] PCI/ACPI: " Saheed O. Bolarinwa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAA85sZt1_dwpxXMrjHc+5wEH-1w9X4km95UvCzEXqJ_o-OS6Hw@mail.gmail.com \
    --to=ian.kumlien@gmail.com \
    --cc=ashutosh.dixit@intel.com \
    --cc=bjorn@helgaas.com \
    --cc=dennis.dalessandro@intel.com \
    --cc=dledford@redhat.com \
    --cc=helgaas@kernel.org \
    --cc=jgg@ziepe.ca \
    --cc=linux-kernel-mentees@lists.linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=michael.j.ruhl@intel.com \
    --cc=mike.marciniszyn@intel.com \
    --cc=puranjay12@gmail.com \
    --cc=refactormyself@gmail.com \
    --cc=skhan@linuxfoundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-PCI Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-pci/0 linux-pci/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-pci linux-pci/ https://lore.kernel.org/linux-pci \
		linux-pci@vger.kernel.org
	public-inbox-index linux-pci

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-pci


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git