All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ian Kumlien <ian.kumlien@gmail.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>,
	linux-pci <linux-pci@vger.kernel.org>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	"Saheed O. Bolarinwa" <refactormyself@gmail.com>,
	Puranjay Mohan <puranjay12@gmail.com>
Subject: Re: [PATCH 1/3] PCI/ASPM: Use the path max in L1 ASPM latency check
Date: Wed, 24 Feb 2021 23:19:55 +0100	[thread overview]
Message-ID: <CAA85sZuSZck+mTnCTkGikuxQpmNyiShmrbhUUtv91rZARL5Jsw@mail.gmail.com> (raw)
In-Reply-To: <CAA85sZuANe2+-c38LezjeMAjNve9Cj_zamSBe5mTiB+HXX0fdw@mail.gmail.com>

On Thu, Jan 28, 2021 at 1:41 PM Ian Kumlien <ian.kumlien@gmail.com> wrote:
>
> Sorry about the late reply, been trying to figure out what goes wrong
> since this email...
>
> And yes, I think you're right - the fact that it fixed my system was
> basically too good to be true =)

So, finally had some time to look at this again...

I played some with:
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index ac0557a305af..fdf252eee206 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -392,13 +392,13 @@ static void pcie_aspm_check_latency(struct
pci_dev *endpoint)

        while (link) {
                /* Check upstream direction L0s latency */
-               if ((link->aspm_capable & ASPM_STATE_L0S_UP) &&
-                   (link->latency_up.l0s > acceptable->l0s))
+               if ((link->aspm_capable & ASPM_STATE_L0S_UP) /* &&
+                   (link->latency_up.l0s > acceptable->l0s)*/)
                        link->aspm_capable &= ~ASPM_STATE_L0S_UP;

                /* Check downstream direction L0s latency */
-               if ((link->aspm_capable & ASPM_STATE_L0S_DW) &&
-                   (link->latency_dw.l0s > acceptable->l0s))
+               if ((link->aspm_capable & ASPM_STATE_L0S_DW) /* &&
+                   (link->latency_dw.l0s > acceptable->l0s)*/)
                        link->aspm_capable &= ~ASPM_STATE_L0S_DW;
                /*
                 * Check L1 latency.
---

Which does perform better but doesn't solve all the issues...

Home machine:
Latency:       3.364 ms
Download:    640.170 Mbit/s
Upload:      918.865 Mbit/s

My test server:
Latency:       4.549 ms
Download:    945.137 Mbit/s
Upload:      957.848 Mbit/s

But iperf3 still gets bogged down...
[  5]   0.00-1.00   sec  4.66 MBytes  39.0 Mbits/sec    0   82.0 KBytes
[  5]   1.00-2.00   sec  4.60 MBytes  38.6 Mbits/sec    0   79.2 KBytes
[  5]   2.00-3.00   sec  4.47 MBytes  37.5 Mbits/sec    0   56.6 KBytes

And with L1 ASPM disabled as usual:
[  5]   0.00-1.00   sec   112 MBytes   938 Mbits/sec  439    911 KBytes
[  5]   1.00-2.00   sec   109 MBytes   912 Mbits/sec  492    888 KBytes
[  5]   2.00-3.00   sec   110 MBytes   923 Mbits/sec  370   1.07 MBytes

And just for reference, bredbandskollen again with L1 ASPM disabled:
Latency:       2.281 ms
Download:    742.136 Mbit/s
Upload:      938.053 Mbit/s

Anyway, we started to look at the PCIe bridges etc, but i think it's
the network card
that is at fault, either with advertised latencies (should be lower)
or some bug since
other cards and peripherals connected to the system works just fine...

So, L0s actually seems to have somewhat of an impact - which I found
surprising sice
both machines are ~6 hops away - however latency differs (measured with tcp)

Can we collect L1 ASPM latency numbers for:
Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)

> On Tue, Jan 12, 2021 at 9:42 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > My guess is the real problem is the Switch is advertising incorrect
> > exit latencies.  If the Switch advertised "<64us" exit latency for its
> > Upstream Port, we'd compute "64us exit latency + 1us Switch delay =
> > 65us", which is more than either 03:00.0 or 04:00.0 can tolerate, so
> > we would disable L1 on that upstream Link.
> >
> > Working around this would require some sort of quirk to override the
> > values read from the Switch, which is a little messy.  Here's what I'm
> > thinking (not asking you to do this; just trying to think of an
> > approach):
>
> The question is if it's the device or if it's the bridge...
>
> Ie, if the device can't quite handle it or if the bridge/switch/port
> advertises the wrong latency
> I have a friend with a similar motherboard and he has the same latency
> values - but his kernel doesn't apply ASPM
>
> I also want to check L0s since it seems to be enabled...
>
> >   - Configure common clock earlier, in pci_configure_device(), because
> >     this changes the "read-only" L1 exit latencies in Link
> >     Capabilities.
> >
> >   - Cache Link Capabilities in the pci_dev.
> >
> >   - Add a quirk to override the cached values for the Switch based on
> >     Vendor/Device ID and probably DMI motherboard/BIOS info.
>
> I can't say and I actually think it depends on the actual culprit
> which we haven't quite identified yet...
>
> > Bjorn

  reply	other threads:[~2021-02-24 22:22 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-07 13:28 [PATCH] Use maximum latency when determining L1 ASPM Ian Kumlien
2020-10-08  4:20 ` Kai-Heng Feng
2020-10-08 16:13 ` Bjorn Helgaas
2020-10-12 10:20   ` Ian Kumlien
2020-10-14  8:34     ` Kai-Heng Feng
2020-10-14 13:33       ` Ian Kumlien
2020-10-14 14:36         ` Bjorn Helgaas
2020-10-14 15:39           ` Ian Kumlien
2020-10-16 14:53             ` Ian Kumlien
2020-10-16 21:28         ` Bjorn Helgaas
2020-10-16 22:41           ` Ian Kumlien
2020-10-18 11:35             ` Ian Kumlien
2020-10-22 15:37               ` Bjorn Helgaas
2020-10-22 15:41                 ` Ian Kumlien
2020-10-22 18:30                   ` Bjorn Helgaas
2020-10-24 20:55                     ` [PATCH 1/3] PCI/ASPM: Use the path max in L1 ASPM latency check Ian Kumlien
2020-10-24 20:55                       ` [PATCH 2/3] PCI/ASPM: Fix L0s max " Ian Kumlien
2020-11-15 21:49                         ` Ian Kumlien
2020-10-24 20:55                       ` [PATCH 3/3] [RFC] PCI/ASPM: Print L1/L0s latency messages per endpoint Ian Kumlien
2020-11-15 21:49                       ` [PATCH 1/3] PCI/ASPM: Use the path max in L1 ASPM latency check Ian Kumlien
2020-12-07 11:04                         ` Ian Kumlien
2020-12-12 23:47                       ` Bjorn Helgaas
2020-12-13 21:39                         ` Ian Kumlien
2020-12-14  5:44                           ` Bjorn Helgaas
2020-12-14  5:44                             ` [Intel-wired-lan] " Bjorn Helgaas
2020-12-14  9:14                             ` Ian Kumlien
2020-12-14  9:14                               ` [Intel-wired-lan] " Ian Kumlien
2020-12-14 14:02                               ` Bjorn Helgaas
2020-12-14 14:02                                 ` [Intel-wired-lan] " Bjorn Helgaas
2020-12-14 15:47                                 ` Ian Kumlien
2020-12-14 15:47                                   ` [Intel-wired-lan] " Ian Kumlien
2020-12-14 19:19                                   ` Bjorn Helgaas
2020-12-14 19:19                                     ` [Intel-wired-lan] " Bjorn Helgaas
2020-12-14 22:56                                     ` Ian Kumlien
2020-12-14 22:56                                       ` [Intel-wired-lan] " Ian Kumlien
2020-12-15  0:40                                       ` Bjorn Helgaas
2020-12-15  0:40                                         ` [Intel-wired-lan] " Bjorn Helgaas
2020-12-15 13:09                                         ` Ian Kumlien
2020-12-15 13:09                                           ` [Intel-wired-lan] " Ian Kumlien
2020-12-16  0:08                                           ` Bjorn Helgaas
2020-12-16  0:08                                             ` [Intel-wired-lan] " Bjorn Helgaas
2020-12-16 11:20                                             ` Ian Kumlien
2020-12-16 11:20                                               ` [Intel-wired-lan] " Ian Kumlien
2020-12-16 23:21                                               ` Bjorn Helgaas
2020-12-16 23:21                                                 ` [Intel-wired-lan] " Bjorn Helgaas
2020-12-17 23:37                                                 ` Ian Kumlien
2020-12-17 23:37                                                   ` [Intel-wired-lan] " Ian Kumlien
2021-01-12 20:42                       ` Bjorn Helgaas
2021-01-28 12:41                         ` Ian Kumlien
2021-02-24 22:19                           ` Ian Kumlien [this message]
2021-02-25 22:03                             ` Bjorn Helgaas
2021-04-26 14:36                               ` Ian Kumlien
2021-04-28 21:15                                 ` Bjorn Helgaas
2021-05-15 11:52                                   ` Ian Kumlien

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAA85sZuSZck+mTnCTkGikuxQpmNyiShmrbhUUtv91rZARL5Jsw@mail.gmail.com \
    --to=ian.kumlien@gmail.com \
    --cc=alexander.duyck@gmail.com \
    --cc=helgaas@kernel.org \
    --cc=kai.heng.feng@canonical.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=puranjay12@gmail.com \
    --cc=refactormyself@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.