From: Kai-Heng Feng <kai.heng.feng@canonical.com>
To: Ian Kumlien <ian.kumlien@gmail.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
linux-pci <linux-pci@vger.kernel.org>,
Alexander Duyck <alexander.duyck@gmail.com>,
"Saheed O. Bolarinwa" <refactormyself@gmail.com>,
Puranjay Mohan <puranjay12@gmail.com>
Subject: Re: [PATCH] Use maximum latency when determining L1 ASPM
Date: Wed, 14 Oct 2020 16:34:17 +0800
Message-ID: <0AD07E1E-02D1-4208-B90F-1949C85ECB64@canonical.com>
In-Reply-To: <CAA85sZsxLZ5m9SNe=5RD9oA7FV0mdwEvGqnXkdtbp_-e_6G5LQ@mail.gmail.com>

> On Oct 12, 2020, at 18:20, Ian Kumlien <ian.kumlien@gmail.com> wrote:
>
> On Thu, Oct 8, 2020 at 6:13 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>>
>> On Wed, Oct 07, 2020 at 03:28:08PM +0200, Ian Kumlien wrote:
>>> Make pcie_aspm_check_latency comply with the PCIe spec, specifically:
>>> "5.4.1.2.2. Exit from the L1 State"
>>>
>>> That section makes it clear that each switch is required to initiate a
>>> transition within 1 μs of receiving it. This latency accumulates along
>>> the path, and we then have to wait for the slowest link on the path
>>> before entering the L0 state from L1.
>>>
>>> The current code doesn't take the maximum latency into account.
>>>
>>> From the example:
>>>         +----------------+
>>>         |                |
>>>         |  Root complex  |
>>>         |                |
>>>         |    +-----+     |
>>>         |    |32 μs|     |
>>>         +----------------+
>>>                 |
>>>                 |  Link 1
>>>                 |
>>>         +----------------+
>>>         |     |8 μs|     |
>>>         |     +----+     |
>>>         |    Switch A    |
>>>         |     +----+     |
>>>         |     |8 μs|     |
>>>         +----------------+
>>>                 |
>>>                 |  Link 2
>>>                 |
>>>         +----------------+
>>>         |    |32 μs|     |
>>>         |    +-----+     |
>>>         |    Switch B    |
>>>         |    +-----+     |
>>>         |    |32 μs|     |
>>>         +----------------+
>>>                 |
>>>                 |  Link 3
>>>                 |
>>>         +----------------+
>>>         |     |8μs|      |
>>>         |     +---+      |
>>>         |   Endpoint C   |
>>>         |                |
>>>         |                |
>>>         +----------------+
>>>
>>> Links 1, 2 and 3 are all in L1 state - endpoint C initiates the
>>> transition to L0 at time T. Since switch B takes 32 μs to exit L1 on
>>> its ports, Link 3 will transition to L0 at T+32 (longest time
>>> considering T+8 for endpoint C and T+32 for switch B).
>>>
>>> Switch B is required to initiate a transition from the L1 state on its
>>> upstream port after no more than 1 μs from the beginning of the
>>> transition from L1 state on the downstream port. Therefore, transition from
>>> L1 to L0 will begin on link 2 at T+1; this will cascade up the path.
>>>
>>> The path will exit L1 at T+34.
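
The arithmetic in this example can be checked with a small user-space
sketch. It is only a model under stated assumptions, not kernel code:
struct link_l1 and path_l1_exit_ns() are hypothetical names, latencies
are in nanoseconds, and links[0] is assumed to be the link closest to
the endpoint. Each link is assumed to start its L1 exit exactly 1 μs
after the link below it (the spec only says "no more than 1 μs"), so
this gives the worst case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one link: L1 exit latency (ns) of the port at each end. */
struct link_l1 {
	uint32_t up_ns;	/* port on the downstream component, facing upstream */
	uint32_t dw_ns;	/* port on the upstream component, facing downstream */
};

/*
 * Worst-case time, in ns after T, at which every link on the path is
 * back in L0. Link i is assumed to begin its exit i * 1000 ns after T
 * and then needs the larger of its two port latencies.
 */
static uint32_t path_l1_exit_ns(const struct link_l1 *links, size_t n)
{
	uint32_t done = 0;

	assert(links != NULL && n > 0);

	for (size_t i = 0; i < n; i++) {
		uint32_t lat = links[i].up_ns > links[i].dw_ns ?
			       links[i].up_ns : links[i].dw_ns;
		uint32_t finish = (uint32_t)i * 1000u + lat;

		if (finish > done)
			done = finish;
	}
	return done;
}
```

With Link 3 = {8000, 32000}, Link 2 = {32000, 8000} and Link 1 =
{8000, 32000}, in that order, the function returns 34000, which is the
T+34 above.
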
>>>
>>> On my specific system:
>>> lspci -PP -s 04:00.0
>>> 00:01.2/01:00.0/02:04.0/04:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. Device 816e (rev 1a)
>>>
>>> lspci -vvv -s 04:00.0
>>> DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>>> ...
>>> LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us
>>> ...
>>>
>>> Which means that it can't be followed by any switch that is in L1 state.
>>>
>>> This patch fixes it by disabling L1 on 02:04.0, 01:00.0 and 00:01.2.
>>>
>>>                                                     LnkCtl    LnkCtl
>>>            ------DevCap-------  ----LnkCap-------  -Before-  -After--
>>>   00:01.2                                L1 <32us       L1+       L1-
>>>   01:00.0                                L1 <32us       L1+       L1-
>>>   02:04.0                                L1 <32us       L1+       L1-
>>>   04:00.0  L0s <512 L1 <64us             L1 <64us       L1+       L1-
>>
>> OK, now we're getting close. We just need to flesh out the
>> justification. We need:
>>
>> - Tidy subject line. Use "git log --oneline drivers/pci/pcie/aspm.c"
>> and follow the example.
>
> Will do
>
>> - Description of the problem. I think it's poor bandwidth on your
>> Intel I211 device, but we don't have the complete picture because
>> that NIC is 03:00.0, which doesn't appear above at all.
>
> I think we'll use Kai-Heng's issue, since it's actually more related to
> the change itself...
>
> Mine is a side effect while Kai-Heng is actually hitting an issue
> caused by the bug.

I filed a bug here:
https://bugzilla.kernel.org/show_bug.cgi?id=209671

Kai-Heng

>
>> - Explanation of what's wrong with the "before" ASPM configuration.
>> I want to identify what is wrong on your system. The generic
>> "doesn't match spec" part is good, but step 1 is the specific
>> details, step 2 is the generalization to relate it to the spec.
>>
>> - Complete "sudo lspci -vv" information for before and after the
>> patch below. https://bugzilla.kernel.org/show_bug.cgi?id=208741
>> has some of this, but some of the lspci output appears to be
>> copy/pasted and lost all its formatting, and it's not clear how
>> some was collected (what kernel version, with/without patch, etc).
>> Since I'm asking for bugzilla attachments, there's no space
>> constraint, so just attach the complete unedited output for the
>> whole system.
>>
>> - URL to the bugzilla. Please open a new one with just the relevant
>> problem report ("NIC is slow") and attach (1) "before" lspci
>> output, (2) proposed patch, (3) "after" lspci output. The
>> existing 208741 report is full of distractions and jumps to the
>> conclusion without actually starting with the details of the
>> problem.
>>
>> Some of this I would normally just do myself, but I can't get the
>> lspci info. It would be really nice if Kai-Heng could also add
>> before/after lspci output from the system he tested.
>>
>>> Signed-off-by: Ian Kumlien <ian.kumlien@gmail.com>
>>> ---
>>> drivers/pci/pcie/aspm.c | 23 +++++++++++++++--------
>>> 1 file changed, 15 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>>> index 253c30cc1967..893b37669087 100644
>>> --- a/drivers/pci/pcie/aspm.c
>>> +++ b/drivers/pci/pcie/aspm.c
>>> @@ -434,7 +434,7 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,
>>>
>>>  static void pcie_aspm_check_latency(struct pci_dev *endpoint)
>>>  {
>>> -	u32 latency, l1_switch_latency = 0;
>>> +	u32 latency, l1_max_latency = 0, l1_switch_latency = 0;
>>>  	struct aspm_latency *acceptable;
>>>  	struct pcie_link_state *link;
>>>
>>> @@ -456,10 +456,14 @@ static void pcie_aspm_check_latency(struct pci_dev *endpoint)
>>>  		if ((link->aspm_capable & ASPM_STATE_L0S_DW) &&
>>>  		    (link->latency_dw.l0s > acceptable->l0s))
>>>  			link->aspm_capable &= ~ASPM_STATE_L0S_DW;
>>> +
>>>  		/*
>>>  		 * Check L1 latency.
>>> -		 * Every switch on the path to root complex need 1
>>> -		 * more microsecond for L1. Spec doesn't mention L0s.
>>> +		 *
>>> +		 * PCIe r5.0, sec 5.4.1.2.2 states:
>>> +		 * A Switch is required to initiate an L1 exit transition on its
>>> +		 * Upstream Port Link after no more than 1 μs from the beginning of an
>>> +		 * L1 exit transition on any of its Downstream Port Links.
>>>  		 *
>>>  		 * The exit latencies for L1 substates are not advertised
>>>  		 * by a device. Since the spec also doesn't mention a way
>>> @@ -469,11 +473,14 @@ static void pcie_aspm_check_latency(struct pci_dev *endpoint)
>>>  		 * L1 exit latencies advertised by a device include L1
>>>  		 * substate latencies (and hence do not do any check).
>>>  		 */
>>> -		latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
>>> -		if ((link->aspm_capable & ASPM_STATE_L1) &&
>>> -		    (latency + l1_switch_latency > acceptable->l1))
>>> -			link->aspm_capable &= ~ASPM_STATE_L1;
>>> -		l1_switch_latency += 1000;
>>> +		if (link->aspm_capable & ASPM_STATE_L1) {
>>> +			latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
>>> +			l1_max_latency = max_t(u32, latency, l1_max_latency);
>>> +			if (l1_max_latency + l1_switch_latency > acceptable->l1)
>>> +				link->aspm_capable &= ~ASPM_STATE_L1;
>>> +
>>> +			l1_switch_latency += 1000;
>>> +		}
>>>
>>>  		link = link->parent;
>>>  	}
>>> --
>>> 2.28.0
>>>
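
To make the behavioural difference in the hunk above easier to see,
here is a rough user-space model of the L1 part of the loop before and
after the patch. Everything in it is an assumption for illustration:
struct model_link, check_l1_old() and check_l1_patched() are
hypothetical names, the path is a plain array with path[0] nearest the
endpoint instead of the kernel's link->parent chain, and latencies are
in nanoseconds. It is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the L1 fields of one link on the path. */
struct model_link {
	uint32_t up_l1_ns;	/* L1 exit latency, upstream direction */
	uint32_t dw_l1_ns;	/* L1 exit latency, downstream direction */
	bool l1_capable;	/* models aspm_capable & ASPM_STATE_L1 */
};

static uint32_t max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/* Pre-patch logic: each link is judged only on its own latency. */
static void check_l1_old(struct model_link *path, size_t n, uint32_t acceptable_ns)
{
	uint32_t switch_ns = 0;

	assert(path != NULL);
	for (size_t i = 0; i < n; i++) {
		uint32_t lat = max_u32(path[i].up_l1_ns, path[i].dw_l1_ns);

		if (path[i].l1_capable && lat + switch_ns > acceptable_ns)
			path[i].l1_capable = false;
		switch_ns += 1000;
	}
}

/* Patched logic: carry the largest latency seen so far up the path. */
static void check_l1_patched(struct model_link *path, size_t n, uint32_t acceptable_ns)
{
	uint32_t max_ns = 0, switch_ns = 0;

	assert(path != NULL);
	for (size_t i = 0; i < n; i++) {
		if (!path[i].l1_capable)
			continue;	/* as in the patch, no 1 us is added here */

		max_ns = max_u32(max_ns,
				 max_u32(path[i].up_l1_ns, path[i].dw_l1_ns));
		if (max_ns + switch_ns > acceptable_ns)
			path[i].l1_capable = false;
		switch_ns += 1000;
	}
}
```

For a made-up two-link path { {64000, 32000}, {32000, 32000} } with an
acceptable latency of 64000, the old check leaves L1 enabled on both
links, while the patched check clears it on the second link because
64000 + 1000 exceeds the limit. How the real ports in the lspci output
map onto links is not modelled here.
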