linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ian Kumlien <ian.kumlien@gmail.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: linux-pci <linux-pci@vger.kernel.org>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	"Saheed O. Bolarinwa" <refactormyself@gmail.com>,
	Puranjay Mohan <puranjay12@gmail.com>,
	Kai-Heng Feng <kai.heng.feng@canonical.com>
Subject: Re: [PATCH] Use maximum latency when determining L1 ASPM
Date: Mon, 12 Oct 2020 12:20:28 +0200	[thread overview]
Message-ID: <CAA85sZsxLZ5m9SNe=5RD9oA7FV0mdwEvGqnXkdtbp_-e_6G5LQ@mail.gmail.com> (raw)
In-Reply-To: <20201008161312.GA3261279@bjorn-Precision-5520>

On Thu, Oct 8, 2020 at 6:13 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Wed, Oct 07, 2020 at 03:28:08PM +0200, Ian Kumlien wrote:
> > Make pcie_aspm_check_latency comply with the PCIe spec, specifically:
> > "5.4.1.2.2. Exit from the L1 State"
> >
> > Which makes it clear that each switch is required to initiate a
> > transition within 1μs from receiving it, accumulating this latency and
> > then we have to wait for the slowest link along the path before
> > entering L0 state from L1.
> >
> > The current code doesn't take the maximum latency into account.
> >
> > From the example:
> >    +----------------+
> >    |                |
> >    |  Root complex  |
> >    |                |
> >    |    +-----+     |
> >    |    |32 μs|     |
> >    +----------------+
> >            |
> >            |  Link 1
> >            |
> >    +----------------+
> >    |     |8 μs|     |
> >    |     +----+     |
> >    |    Switch A    |
> >    |     +----+     |
> >    |     |8 μs|     |
> >    +----------------+
> >            |
> >            |  Link 2
> >            |
> >    +----------------+
> >    |    |32 μs|     |
> >    |    +-----+     |
> >    |    Switch B    |
> >    |    +-----+     |
> >    |    |32 μs|     |
> >    +----------------+
> >            |
> >            |  Link 3
> >            |
> >    +----------------+
> >    |     |8μs|      |
> >    |     +---+      |
> >    |   Endpoint C   |
> >    |                |
> >    |                |
> >    +----------------+
> >
> > Links 1, 2 and 3 are all in L1 state - endpoint C initiates the
> > transition to L0 at time T. Since switch B takes 32 μs to exit L1 on
> > it's ports, Link 3 will transition to L0 at T+32 (longest time
> > considering T+8 for endpoint C and T+32 for switch B).
> >
> > Switch B is required to initiate a transition from the L1 state on it's
> > upstream port after no more than 1 μs from the beginning of the
> > transition from L1 state on the downstream port. Therefore, transition from
> > L1 to L0 will begin on link 2 at T+1, this will cascade up the path.
> >
> > The path will exit L1 at T+34.
> >
> > On my specific system:
> > lspci -PP -s 04:00.0
> > 00:01.2/01:00.0/02:04.0/04:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. Device 816e (rev 1a)
> >
> > lspci -vvv -s 04:00.0
> >               DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
> > ...
> >               LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us
> > ...
> >
> > Which means that it can't be followed by any switch that is in L1 state.
> >
> > This patch fixes it by disabling L1 on 02:04.0, 01:00.0 and 00:01.2.
> >
> >                                                     LnkCtl    LnkCtl
> >            ------DevCap-------  ----LnkCap-------  -Before-  -After--
> >   00:01.2                                L1 <32us       L1+       L1-
> >   01:00.0                                L1 <32us       L1+       L1-
> >   02:04.0                                L1 <32us       L1+       L1-
> >   04:00.0  L0s <512 L1 <64us             L1 <64us       L1+       L1-
>
> OK, now we're getting close.  We just need to flesh out the
> justification.  We need:
>
>   - Tidy subject line.  Use "git log --oneline drivers/pci/pcie/aspm.c"
>     and follow the example.

Will do

>   - Description of the problem.  I think it's poor bandwidth on your
>     Intel I211 device, but we don't have the complete picture because
>     that NIC is 03:00.0, which doesn't appear above at all.

I think we'll use Kai-Hengs issue, since it's actually more related to
the change itself...

Mine is a side effect while Kai-Heng is actually hitting an issue
caused by the bug.

>   - Explanation of what's wrong with the "before" ASPM configuration.
>     I want to identify what is wrong on your system.  The generic
>     "doesn't match spec" part is good, but step 1 is the specific
>     details, step 2 is the generalization to relate it to the spec.
>
>   - Complete "sudo lspci -vv" information for before and after the
>     patch below.  https://bugzilla.kernel.org/show_bug.cgi?id=208741
>     has some of this, but some of the lspci output appears to be
>     copy/pasted and lost all its formatting, and it's not clear how
>     some was collected (what kernel version, with/without patch, etc).
>     Since I'm asking for bugzilla attachments, there's no space
>     constraint, so just attach the complete unedited output for the
>     whole system.
>
>   - URL to the bugzilla.  Please open a new one with just the relevant
>     problem report ("NIC is slow") and attach (1) "before" lspci
>     output, (2) proposed patch, (3) "after" lspci output.  The
>     existing 208741 report is full of distractions and jumps to the
>     conclusion without actually starting with the details of the
>     problem.
>
> Some of this I would normally just do myself, but I can't get the
> lspci info.  It would be really nice if Kai-Heng could also add
> before/after lspci output from the system he tested.
>
> > Signed-off-by: Ian Kumlien <ian.kumlien@gmail.com>
> > ---
> >  drivers/pci/pcie/aspm.c | 23 +++++++++++++++--------
> >  1 file changed, 15 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index 253c30cc1967..893b37669087 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -434,7 +434,7 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,
> >
> >  static void pcie_aspm_check_latency(struct pci_dev *endpoint)
> >  {
> > -     u32 latency, l1_switch_latency = 0;
> > +     u32 latency, l1_max_latency = 0, l1_switch_latency = 0;
> >       struct aspm_latency *acceptable;
> >       struct pcie_link_state *link;
> >
> > @@ -456,10 +456,14 @@ static void pcie_aspm_check_latency(struct pci_dev *endpoint)
> >               if ((link->aspm_capable & ASPM_STATE_L0S_DW) &&
> >                   (link->latency_dw.l0s > acceptable->l0s))
> >                       link->aspm_capable &= ~ASPM_STATE_L0S_DW;
> > +
> >               /*
> >                * Check L1 latency.
> > -              * Every switch on the path to root complex need 1
> > -              * more microsecond for L1. Spec doesn't mention L0s.
> > +              *
> > +              * PCIe r5.0, sec 5.4.1.2.2 states:
> > +              * A Switch is required to initiate an L1 exit transition on its
> > +              * Upstream Port Link after no more than 1 μs from the beginning of an
> > +              * L1 exit transition on any of its Downstream Port Links.
> >                *
> >                * The exit latencies for L1 substates are not advertised
> >                * by a device.  Since the spec also doesn't mention a way
> > @@ -469,11 +473,14 @@ static void pcie_aspm_check_latency(struct pci_dev *endpoint)
> >                * L1 exit latencies advertised by a device include L1
> >                * substate latencies (and hence do not do any check).
> >                */
> > -             latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
> > -             if ((link->aspm_capable & ASPM_STATE_L1) &&
> > -                 (latency + l1_switch_latency > acceptable->l1))
> > -                     link->aspm_capable &= ~ASPM_STATE_L1;
> > -             l1_switch_latency += 1000;
> > +             if (link->aspm_capable & ASPM_STATE_L1) {
> > +                     latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
> > +                     l1_max_latency = max_t(u32, latency, l1_max_latency);
> > +                     if (l1_max_latency + l1_switch_latency > acceptable->l1)
> > +                             link->aspm_capable &= ~ASPM_STATE_L1;
> > +
> > +                     l1_switch_latency += 1000;
> > +             }
> >
> >               link = link->parent;
> >       }
> > --
> > 2.28.0
> >

  reply	other threads:[~2020-10-12 10:20 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-07 13:28 [PATCH] Use maximum latency when determining L1 ASPM Ian Kumlien
2020-10-08  4:20 ` Kai-Heng Feng
2020-10-08 16:13 ` Bjorn Helgaas
2020-10-12 10:20   ` Ian Kumlien [this message]
2020-10-14  8:34     ` Kai-Heng Feng
2020-10-14 13:33       ` Ian Kumlien
2020-10-14 14:36         ` Bjorn Helgaas
2020-10-14 15:39           ` Ian Kumlien
2020-10-16 14:53             ` Ian Kumlien
2020-10-16 21:28         ` Bjorn Helgaas
2020-10-16 22:41           ` Ian Kumlien
2020-10-18 11:35             ` Ian Kumlien
2020-10-22 15:37               ` Bjorn Helgaas
2020-10-22 15:41                 ` Ian Kumlien
2020-10-22 18:30                   ` Bjorn Helgaas
2020-10-24 20:55                     ` [PATCH 1/3] PCI/ASPM: Use the path max in L1 ASPM latency check Ian Kumlien
2020-10-24 20:55                       ` [PATCH 2/3] PCI/ASPM: Fix L0s max " Ian Kumlien
2020-11-15 21:49                         ` Ian Kumlien
2020-10-24 20:55                       ` [PATCH 3/3] [RFC] PCI/ASPM: Print L1/L0s latency messages per endpoint Ian Kumlien
2020-11-15 21:49                       ` [PATCH 1/3] PCI/ASPM: Use the path max in L1 ASPM latency check Ian Kumlien
2020-12-07 11:04                         ` Ian Kumlien
2020-12-12 23:47                       ` Bjorn Helgaas
2020-12-13 21:39                         ` Ian Kumlien
2020-12-14  5:44                           ` Bjorn Helgaas
2020-12-14  9:14                             ` Ian Kumlien
2020-12-14 14:02                               ` Bjorn Helgaas
2020-12-14 15:47                                 ` Ian Kumlien
2020-12-14 19:19                                   ` Bjorn Helgaas
2020-12-14 22:56                                     ` Ian Kumlien
2020-12-15  0:40                                       ` Bjorn Helgaas
2020-12-15 13:09                                         ` Ian Kumlien
2020-12-16  0:08                                           ` Bjorn Helgaas
2020-12-16 11:20                                             ` Ian Kumlien
2020-12-16 23:21                                               ` Bjorn Helgaas
2020-12-17 23:37                                                 ` Ian Kumlien
2021-01-12 20:42                       ` Bjorn Helgaas
2021-01-28 12:41                         ` Ian Kumlien
2021-02-24 22:19                           ` Ian Kumlien
2021-02-25 22:03                             ` Bjorn Helgaas
2021-04-26 14:36                               ` Ian Kumlien
2021-04-28 21:15                                 ` Bjorn Helgaas
2021-05-15 11:52                                   ` Ian Kumlien
  -- strict thread matches above, loose matches on Subject: below --
2020-07-27 21:30 [PATCH] Use maximum latency when determining L1 ASPM Ian Kumlien
2020-07-29 22:27 ` Bjorn Helgaas
2020-07-29 22:43   ` Ian Kumlien
2020-07-26 22:06 Ian Kumlien
2020-07-26 22:06 ` Ian Kumlien
2020-07-27 21:17   ` Ian Kumlien

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAA85sZsxLZ5m9SNe=5RD9oA7FV0mdwEvGqnXkdtbp_-e_6G5LQ@mail.gmail.com' \
    --to=ian.kumlien@gmail.com \
    --cc=alexander.duyck@gmail.com \
    --cc=helgaas@kernel.org \
    --cc=kai.heng.feng@canonical.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=puranjay12@gmail.com \
    --cc=refactormyself@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).