regressions.lists.linux.dev archive mirror
From: Thorsten Leemhuis <regressions@leemhuis.info>
To: "regressions@lists.linux.dev" <regressions@lists.linux.dev>
Subject: Re: [REGRESSION] r8169: RTL8168h "transmit queue 0 timed out" after ASPM L1 enablement #forregzbot
Date: Mon, 4 Jul 2022 12:56:46 +0200
Message-ID: <73d98a7c-6781-3729-89b4-b1498f919ae4@leemhuis.info>
In-Reply-To: <93000ee0-7c9b-c636-c21a-eaade2ba1f6c@leemhuis.info>

TWIMC: this mail is primarily sent for documentation purposes and for
regzbot, my Linux kernel regression tracking bot. These mails usually
contain '#forregzbot' in the subject, to make them easy to spot and filter.

On 20.06.22 08:40, Thorsten Leemhuis wrote:
> On 08.06.22 09:30, Heiner Kallweit wrote:
>> On 08.06.2022 02:44, Bernhard Hampel-Waffenthal wrote:
>>> #regzbot introduced: 4b5f82f6aaef3fa95cce52deb8510f55ddda6a71
>>>
>>> since the last major kernel version upgrade to 5.18 on Arch Linux I'm unable to get a usable ethernet connection on my desktop PC.
>>>
>>> I can see a timeout in the logs
>>>
>>>> kernel: NETDEV WATCHDOG: enp37s0 (r8169): transmit queue 0 timed out
>>>
>>> followed by regular, very likely related, errors:
>>>
>>>> kernel: r8169 0000:25:00.0 enp37s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
>>>
>>> The link does manage to go up at nominal full 1Gbps speed, but there is no usable connection to speak of and pings are very bursty and take multiple seconds.
>>>
>>> I was able to pinpoint that the problems were introduced in commit 4b5f82f6aaef3fa95cce52deb8510f55ddda6a71 with the enablement of ASPM L1/L1.1 for ">= RTL_GIGA_MAC_VER_45", which my chip falls under. Adding pcie_aspm=off to the kernel command line, or changing that check to ">= RTL_GIGA_MAC_VER_60" for testing purposes and recompiling the kernel, fixes my problems.
>>>
>>> I'm using an MSI B450I GAMING PLUS AC motherboard with an RTL8168h chip as per dmesg:
>>>
>>>> r8169 0000:25:00.0 eth0: RTL8168h/8111h, 30:9c:23:de:97:a9, XID 541, IRQ 101
>>>
>>> lspci says:
>>>
>>>> 25:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
>>>         Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7a40]
>>>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>>>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>>         Latency: 0, Cache Line Size: 64 bytes
>>>         Interrupt: pin A routed to IRQ 30
>>>         IOMMU group: 14
>>>         Region 0: I/O ports at f000 [size=256]
>>>         Region 2: Memory at fcb04000 (64-bit, non-prefetchable) [size=4K]
>>>         Region 4: Memory at fcb00000 (64-bit, non-prefetchable) [size=16K]
>>>         Capabilities: <access denied>
>>>         Kernel driver in use: r8169
>>>         Kernel modules: r8169
>>>
>> Thanks for the report. On my test systems RTL8168h works fine with ASPM L1 and L1.1, so it seems to be
>> a board-specific issue.
> 
> Well, we already removed changes like the one causing this if things
> like ASPM cause regressions only for some users because their HW is
> flaky. But I'd prefer to avoid that myself.
> 
>> Some reports in the past indicated that changing IOMMU settings may help,
>> you can also use the ASPM sysfs link attributes to disable selected ASPM states for just this link.
> 
> Bernhard, did this or the suggestions from Hans help to solve the
> problem for you?
> 
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> 
> P.S.: As the Linux kernel's regression tracker I deal with a lot of
> reports and sometimes miss something important when writing mails like
> this. If that's the case here, don't hesitate to tell me in a public
> reply, it's in everyone's interest to set the public record straight.

#regzbot invalid: reporter didn't reply, might be satisfied with the
help he got, and a tricky situation anyway
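
For reference, the workaround Heiner mentions above (disabling selected
ASPM states for just this link through sysfs) could be scripted roughly
as below. This is only a sketch under assumptions: it assumes the kernel
exposes the per-link attributes l0s_aspm, l1_aspm, l1_1_aspm and
l1_2_aspm under /sys/bus/pci/devices/<address>/link/, and the helper
name set_aspm_states is made up for illustration. Writing to the real
sysfs files needs root.

```python
from pathlib import Path

# ASPM attributes assumed to live under <device>/link/ in sysfs.
# Which of them actually exist depends on what the link supports.
ASPM_ATTRS = ("l0s_aspm", "l1_aspm", "l1_1_aspm", "l1_2_aspm")


def set_aspm_states(bdf, states, enable=False,
                    sysfs_root="/sys/bus/pci/devices"):
    """Write 0 (disable) or 1 (enable) to the given ASPM link attributes.

    bdf is the PCI address as shown by dmesg, e.g. "0000:25:00.0".
    Returns the list of attributes that were actually written;
    attributes the link does not expose are skipped.
    """
    link_dir = Path(sysfs_root) / bdf / "link"
    changed = []
    for name in states:
        if name not in ASPM_ATTRS:
            raise ValueError(f"unknown ASPM attribute: {name}")
        attr = link_dir / name
        if not attr.exists():
            continue  # state not supported on this link
        attr.write_text("1" if enable else "0")
        changed.append(name)
    return changed

# Hypothetical usage for the device in this report (run as root):
#   set_aspm_states("0000:25:00.0", ["l1_aspm", "l1_1_aspm"])
```

Unlike pcie_aspm=off on the kernel command line, this leaves ASPM alone
for every other device; the setting does not survive a reboot.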

Thread overview: 5+ messages
2022-06-08  0:44 [REGRESSION] r8169: RTL8168h "transmit queue 0 timed out" after ASPM L1 enablement Bernhard Hampel-Waffenthal
2022-06-08  7:30 ` Heiner Kallweit
2022-06-20  6:40   ` Thorsten Leemhuis
2022-07-04 10:56     ` Thorsten Leemhuis [this message]
2022-06-08 13:19 ` Hans de Goede
