From: Kai-Heng Feng <email@example.com>
To: Heiner Kallweit <firstname.lastname@example.org>
Cc: nic_swsd <email@example.com>,
Bjorn Helgaas <firstname.lastname@example.org>,
David Miller <email@example.com>,
Jakub Kicinski <firstname.lastname@example.org>,
Anthony Wong <email@example.com>,
Linux Netdev List <firstname.lastname@example.org>,
Linux PCI <email@example.com>,
Bjorn Helgaas <firstname.lastname@example.org>
Subject: Re: [RFC] [PATCH net-next v4] [PATCH 2/2] r8169: Implement dynamic ASPM mechanism
Date: Tue, 7 Sep 2021 12:58:22 +0800
Message-ID: <CAAd53p4WJkO2FEjfRdvCgkeQVzYr=JQPKDbPNrRuK8RYKmzC5A@mail.gmail.com>
On Tue, Sep 7, 2021 at 12:11 AM Heiner Kallweit <email@example.com> wrote:
> On 06.09.2021 17:10, Kai-Heng Feng wrote:
> > On Sat, Sep 4, 2021 at 4:00 AM Heiner Kallweit <firstname.lastname@example.org> wrote:
> >> On 03.09.2021 17:56, Kai-Heng Feng wrote:
> >>> On Tue, Aug 31, 2021 at 2:09 AM Bjorn Helgaas <email@example.com> wrote:
> >>>> On Sat, Aug 28, 2021 at 01:14:52AM +0800, Kai-Heng Feng wrote:
> >>>>> r8169 NICs on some platforms have abysmal speed when ASPM is enabled.
> >>>>> Same issue can be observed with older vendor drivers.
> >>>>> The issue is however solved by the latest vendor driver. There's a new
> >>>>> mechanism, which disables r8169's internal ASPM when the NIC traffic has
> >>>>> more than 10 packets, and vice versa. The likely reason is that the
> >>>>> buffer on the chip is too small for its ASPM exit latency.
> >>>> This sounds like good speculation, but of course, it would be better
> >>>> to have the supporting data.
> >>>> You say above that this problem affects r8169 on "some platforms." I
> >>>> infer that ASPM works fine on other platforms. It would be extremely
> >>>> interesting to have some data on both classes, e.g., "lspci -vv"
> >>>> output for the entire system.
> >>> lspci data collected from working and non-working systems can be found here:
> >>> https://bugzilla.kernel.org/show_bug.cgi?id=214307
> >>>> If r8169 ASPM works well on some systems, we *should* be able to make
> >>>> it work well on *all* systems, because the device can't tell what
> >>>> system it's in. All the device can see are the latencies for entry
> >>>> and exit for link states.
> >>> That's definitely better if we can make r8169 ASPM work for all platforms.
> >>>> IIUC this patch makes the driver wake up every 1000ms. If the NIC has
> >>>> sent or received more than 10 packets in the last 1000ms, it disables
> >>>> ASPM; otherwise it enables ASPM.
> >>> Yes, that's correct.
> >>>> I asked these same questions earlier, but nothing changed, so I won't
> >>>> raise them again if you don't think they're pertinent. Some patch
> >>>> splitting comments below.
> >>> Sorry about that. The lspci data is attached.
> >> Thanks for the additional details. I see that both systems have the L1
> >> sub-states active. Do you also face the issue if L1 is enabled but
> >> L1.1 and L1.2 are not? Setting the ASPM policy from powersupersave
> >> to powersave should be sufficient to disable them.
> >> I have a test system Asus PRIME H310I-PLUS, BIOS 2603 10/21/2019 with
> >> the same RTL8168h chip version. With L1 active and sub-states inactive
> >> everything is fine. With the sub-states activated I get a few missed RX
> >> errors when running iperf3.
> > Once L1.1 and L1.2 are disabled the TX speed can reach 710Mbps and RX
> > can reach 941 Mbps. So yes it seems to be the same issue.
> I reach 940-950Mbps in both directions, but this seems to be unrelated
> to what we discuss here.
OK. Is there anything more I need to address in the next iteration?
> > With dynamic ASPM, TX can reach 750 Mbps while ASPM L1.1 and L1.2 are enabled.
> >> One difference between your good and bad logs is the following.
> >> (My test system shows the same LTR value as your bad system.)
> >> Bad:
> >> Capabilities: [170 v1] Latency Tolerance Reporting
> >> Max snoop latency: 3145728ns
> >> Max no snoop latency: 3145728ns
> >> Good:
> >> Capabilities: [170 v1] Latency Tolerance Reporting
> >> Max snoop latency: 1048576ns
> >> Max no snoop latency: 1048576ns
> >> I have to admit that I'm not familiar with LTR and don't know whether
> >> this difference could contribute to the differing behavior.
> > I am also unsure what role LTR plays here, so I tried changing the
> > LTR value to 1048576ns and got the same result: TX and RX remain
> > very slow.
> > Kai-Heng
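For readers following the thread, the policy confirmed above (check once every 1000ms, disable ASPM if more than 10 packets were sent or received in that interval, otherwise enable it) can be modelled roughly as below. This is only a toy simulation of the decision logic as described in the discussion, not the patch itself: the class, its fields, and the constants' names are hypothetical, and no real driver or PCI interface is touched.

```python
from dataclasses import dataclass

# Values taken from the thread: a 1000 ms polling interval and a
# 10-packet threshold. All names in this sketch are hypothetical.
INTERVAL_MS = 1000
PACKET_THRESHOLD = 10


@dataclass
class DynamicAspm:
    """Toy model of the per-interval ASPM decision."""

    aspm_enabled: bool = True
    last_total: int = 0  # cumulative TX+RX packet count at the previous tick

    def tick(self, total_packets: int) -> bool:
        """Run once per INTERVAL_MS with the cumulative packet counter."""
        delta = total_packets - self.last_total
        self.last_total = total_packets
        # More than the threshold in the last interval: disable ASPM.
        # Otherwise: enable ASPM.
        self.aspm_enabled = delta <= PACKET_THRESHOLD
        return self.aspm_enabled


if __name__ == "__main__":
    nic = DynamicAspm()
    # Cumulative counters sampled once per interval: idle, burst, idle.
    for total in (5, 5000, 5003):
        state = "enabled" if nic.tick(total) else "disabled"
        print(f"total={total:>5} -> ASPM {state}")
```

The model makes one property of the scheme visible: the decision always lags traffic by one interval, so the first second of a burst still runs with ASPM enabled.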
Thread overview: 11+ messages
2021-08-27 17:14 [RFC] [PATCH net-next v4 0/2] r8169: Implement dynamic ASPM mechanism for recent 1.0/2.5Gbps Realtek NICs Kai-Heng Feng
2021-08-27 17:14 ` [RFC] [PATCH net-next v4 1/2] PCI/ASPM: Introduce a new helper to report ASPM capability Kai-Heng Feng
2021-08-27 17:14 ` [RFC] [PATCH net-next v4] [PATCH 2/2] r8169: Implement dynamic ASPM mechanism Kai-Heng Feng
2021-08-30 18:09 ` Bjorn Helgaas
2021-09-03 15:56 ` Kai-Heng Feng
2021-09-03 20:00 ` Heiner Kallweit
2021-09-06 15:10 ` Kai-Heng Feng
2021-09-06 15:34 ` Heiner Kallweit
2021-09-07 4:58 ` Kai-Heng Feng [this message]
2021-09-07 6:03 ` Heiner Kallweit
2021-09-15 15:54 ` Kai-Heng Feng