netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Heiner Kallweit <hkallweit1@gmail.com>
To: David Chang <dchang@suse.com>
Cc: Peter Ceiley <peter@ceiley.net>,
	Realtek linux nic maintainers <nic_swsd@realtek.com>,
	netdev@vger.kernel.org
Subject: Re: r8169 Driver - Poor Network Performance Since Kernel 4.19
Date: Thu, 31 Jan 2019 07:35:30 +0100	[thread overview]
Message-ID: <4d832c16-8830-b746-a818-6026c2e6725c@gmail.com> (raw)
In-Reply-To: <b8ed124a-fa79-e242-af03-4d24e4d6c56e@gmail.com>

Hi David, two more things:

1. Could you please test a recent linux-next kernel?
2. Please get a register dump (ethtool -d <if>) from 4.18 and 4.19
   and compare them.

Heiner


On 31.01.2019 07:21, Heiner Kallweit wrote:
> David, thanks for the link to the bug ticket.
> I think only a proper bisect can help to find the offending commit.
> 
> Heiner
> 
> 
> On 31.01.2019 03:32, David Chang wrote:
>> Hi,
>>
>> We had a similr case here.
>> - Realtek r8169 receive performance regression in kernel 4.19
>>   https://bugzilla.suse.com/show_bug.cgi?id=1119649
>>
>> kernel: r8169 0000:01:00.0 eth0: RTL8168h/8111h, XID 54100880
>> The major symptom is there are many rx_missed count.
>>
>>
>> On Jan 30, 2019 at 20:15:45 +0100, Heiner Kallweit wrote:
>>> Hi Peter,
>>>
>>> recently I had somebody where pcie_aspm=off for whatever reason didn't
>>> do the trick, can you also check with pcie_aspm.policy=performance.
>>
>> We will give it a try later.
>>
>>> And please check with "ethtool -S <if>" whether the chip statistics
>>> show a significant number of errors.
>>>
>>> If this doesn't help you may have to bisect to find the offending commit.
>>
>> We had tried fallback driver to a few previous commits as following,
>> but with no luck.
>>
>> 9675931e6b65 r8169: re-enable MSI-X on RTL8168g (v4.19)
>> 098b01ad9837 r8169: don't include asm headers directly (v4.19-rc1)
>> a2965f12fde6 r8169: remove rtl8169_set_speed_xmii (v4.19-rc1)
>> 6fcf9b1d4d6c r8169: fix runtime suspend (v4.19-rc1)
>> e397286b8e89 r8169: remove TBI 1000BaseX support (v4.19-rc1)
>>
>> Thanks,
>> David Chang
>>
>>>
>>> Heiner
>>>
>>>
>>> On 30.01.2019 10:59, Peter Ceiley wrote:
>>>> Hi Heiner,
>>>>
>>>> I tried disabling the ASPM using the pcie_aspm=off kernel parameter
>>>> and this made no difference.
>>>>
>>>> I tried compiling the 4.18.16 r8169.c with the 4.19.18 source and
>>>> subsequently loaded the module in the running 4.19.18 kernel. I can
>>>> confirm that this immediately resolved the issue and access to the NFS
>>>> shares operated as expected.
>>>>
>>>> I presume this means it is an issue with the r8169 driver included in
>>>> 4.19 onwards?
>>>>
>>>> To answer your last questions:
>>>>
>>>> Base Board Information
>>>>     Manufacturer: Alienware
>>>>     Product Name: 0PGRP5
>>>>     Version: A02
>>>>
>>>> ... and yes, the RTL8168 is the onboard network chip.
>>>>
>>>> Regards,
>>>>
>>>> Peter.
>>>>
>>>> On Tue, 29 Jan 2019 at 17:44, Heiner Kallweit <hkallweit1@gmail.com> wrote:
>>>>>
>>>>> Hi Peter,
>>>>>
>>>>> I think the vendor driver doesn't enable ASPM per default.
>>>>> So it's worth a try to disable ASPM in the BIOS or via sysfs.
>>>>> Few older systems seem to have issues with ASPM, what kind of
>>>>> system / mainboard are you using? The RTL8168 is the onboard
>>>>> network chip?
>>>>>
>>>>> Rgds, Heiner
>>>>>
>>>>>
>>>>> On 29.01.2019 07:20, Peter Ceiley wrote:
>>>>>> Hi Heiner,
>>>>>>
>>>>>> Thanks, I'll do some more testing. It might not be the driver - I
>>>>>> assumed it was due to the fact that using the r8168 driver 'resolves'
>>>>>> the issue. I'll see if I can test the r8169.c on top of 4.19 - this is
>>>>>> a good idea.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Peter.
>>>>>>
>>>>>> On Tue, 29 Jan 2019 at 17:16, Heiner Kallweit <hkallweit1@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Peter,
>>>>>>>
>>>>>>> at a first glance it doesn't look like a typical driver issue.
>>>>>>> What you could do:
>>>>>>>
>>>>>>> - Test the r8169.c from 4.18 on top of 4.19.
>>>>>>>
>>>>>>> - Check whether disabling ASPM (/sys/module/pcie_aspm) has an effect.
>>>>>>>
>>>>>>> - Bisect between 4.18 and 4.19 to find the offending commit.
>>>>>>>
>>>>>>> Any specific reason why you think root cause is in the driver and not
>>>>>>> elsewhere in the network subsystem?
>>>>>>>
>>>>>>> Heiner
>>>>>>>
>>>>>>>
>>>>>>> On 28.01.2019 23:10, Peter Ceiley wrote:
>>>>>>>> Hi Heiner,
>>>>>>>>
>>>>>>>> Thanks for getting back to me.
>>>>>>>>
>>>>>>>> No, I don't use jumbo packets.
>>>>>>>>
>>>>>>>> Bandwidth is *generally* good, and iperf results to my NAS provide
>>>>>>>> over 900 Mbits/s in both circumstances. The issue seems to appear when
>>>>>>>> establishing a connection and is most notable, for example, on my
>>>>>>>> mounted NFS shares where it takes seconds (up to 10's of seconds on
>>>>>>>> larger directories) to list the contents of each directory. Once a
>>>>>>>> transfer begins on a file, I appear to get good bandwidth.
>>>>>>>>
>>>>>>>> I'm unsure of the best scientific data to provide you in order to
>>>>>>>> troubleshoot this issue. Running the following
>>>>>>>>
>>>>>>>>     netstat -s |grep retransmitted
>>>>>>>>
>>>>>>>> shows a steady increase in retransmitted segments each time I list the
>>>>>>>> contents of a remote directory, for example, running 'ls' on a
>>>>>>>> directory containing 345 media files did the following using kernel
>>>>>>>> 4.19.18:
>>>>>>>>
>>>>>>>> increased retransmitted segments by 21 and the 'time' command showed
>>>>>>>> the following:
>>>>>>>>     real    0m19.867s
>>>>>>>>     user    0m0.012s
>>>>>>>>     sys    0m0.036s
>>>>>>>>
>>>>>>>> The same command shows no retransmitted segments running kernel
>>>>>>>> 4.18.16 and 'time' showed:
>>>>>>>>     real    0m0.300s
>>>>>>>>     user    0m0.004s
>>>>>>>>     sys    0m0.007s
>>>>>>>>
>>>>>>>> ifconfig does not show any RX/TX errors nor dropped packets in either case.
>>>>>>>>
>>>>>>>> dmesg XID:
>>>>>>>> [    2.979984] r8169 0000:03:00.0 eth0: RTL8168g/8111g,
>>>>>>>> f8:b1:56:fe:67:e0, XID 4c000800, IRQ 32
>>>>>>>>
>>>>>>>> # lspci -vv
>>>>>>>> 03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
>>>>>>>> RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
>>>>>>>>     Subsystem: Dell RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
>>>>>>>>     Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>>>>>>>> ParErr- Stepping- SERR- FastB2B- DisINTx+
>>>>>>>>     Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>>>>>>>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>>>>>>>>     Latency: 0, Cache Line Size: 64 bytes
>>>>>>>>     Interrupt: pin A routed to IRQ 19
>>>>>>>>     Region 0: I/O ports at d000 [size=256]
>>>>>>>>     Region 2: Memory at f7b00000 (64-bit, non-prefetchable) [size=4K]
>>>>>>>>     Region 4: Memory at f2100000 (64-bit, prefetchable) [size=16K]
>>>>>>>>     Capabilities: [40] Power Management version 3
>>>>>>>>         Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA
>>>>>>>> PME(D0+,D1+,D2+,D3hot+,D3cold+)
>>>>>>>>         Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>>>>>>>>     Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
>>>>>>>>         Address: 0000000000000000  Data: 0000
>>>>>>>>     Capabilities: [70] Express (v2) Endpoint, MSI 01
>>>>>>>>         DevCap:    MaxPayload 128 bytes, PhantFunc 0, Latency L0s
>>>>>>>> <512ns, L1 <64us
>>>>>>>>             ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>>>>>>>> SlotPowerLimit 10.000W
>>>>>>>>         DevCtl:    CorrErr- NonFatalErr- FatalErr- UnsupReq-
>>>>>>>>             RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
>>>>>>>>             MaxPayload 128 bytes, MaxReadReq 4096 bytes
>>>>>>>>         DevSta:    CorrErr+ NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
>>>>>>>>         LnkCap:    Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit
>>>>>>>> Latency L0s unlimited, L1 <64us
>>>>>>>>             ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
>>>>>>>>         LnkCtl:    ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
>>>>>>>>             ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
>>>>>>>>         LnkSta:    Speed 2.5GT/s (ok), Width x1 (ok)
>>>>>>>>             TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>>>>>>>>         DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+,
>>>>>>>> OBFF Via message/WAKE#
>>>>>>>>              AtomicOpsCap: 32bit- 64bit- 128bitCAS-
>>>>>>>>         DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+,
>>>>>>>> OBFF Disabled
>>>>>>>>              AtomicOpsCtl: ReqEn-
>>>>>>>>         LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
>>>>>>>>              Transmit Margin: Normal Operating Range,
>>>>>>>> EnterModifiedCompliance- ComplianceSOS-
>>>>>>>>              Compliance De-emphasis: -6dB
>>>>>>>>         LnkSta2: Current De-emphasis Level: -6dB,
>>>>>>>> EqualizationComplete-, EqualizationPhase1-
>>>>>>>>              EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>>>>>>>>     Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
>>>>>>>>         Vector table: BAR=4 offset=00000000
>>>>>>>>         PBA: BAR=4 offset=00000800
>>>>>>>>     Capabilities: [d0] Vital Product Data
>>>>>>>> pcilib: sysfs_read_vpd: read failed: Input/output error
>>>>>>>>         Not readable
>>>>>>>>     Capabilities: [100 v1] Advanced Error Reporting
>>>>>>>>         UESta:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
>>>>>>>> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>>>>>>>         UEMsk:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
>>>>>>>> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>>>>>>>         UESvrt:    DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
>>>>>>>> RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>>>>>>>>         CESta:    RxErr+ BadTLP+ BadDLLP+ Rollover- Timeout+ AdvNonFatalErr-
>>>>>>>>         CEMsk:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
>>>>>>>>         AERCap:    First Error Pointer: 00, ECRCGenCap+ ECRCGenEn-
>>>>>>>> ECRCChkCap+ ECRCChkEn-
>>>>>>>>             MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
>>>>>>>>         HeaderLog: 00000000 00000000 00000000 00000000
>>>>>>>>     Capabilities: [140 v1] Virtual Channel
>>>>>>>>         Caps:    LPEVC=0 RefClk=100ns PATEntryBits=1
>>>>>>>>         Arb:    Fixed- WRR32- WRR64- WRR128-
>>>>>>>>         Ctrl:    ArbSelect=Fixed
>>>>>>>>         Status:    InProgress-
>>>>>>>>         VC0:    Caps:    PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
>>>>>>>>             Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
>>>>>>>>             Ctrl:    Enable+ ID=0 ArbSelect=Fixed TC/VC=01
>>>>>>>>             Status:    NegoPending- InProgress-
>>>>>>>>     Capabilities: [160 v1] Device Serial Number 01-00-00-00-68-4c-e0-00
>>>>>>>>     Capabilities: [170 v1] Latency Tolerance Reporting
>>>>>>>>         Max snoop latency: 71680ns
>>>>>>>>         Max no snoop latency: 71680ns
>>>>>>>>     Kernel driver in use: r8169
>>>>>>>>     Kernel modules: r8169
>>>>>>>>
>>>>>>>> Please let me know if you have any other ideas in terms of testing.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> Peter.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, 29 Jan 2019 at 05:28, Heiner Kallweit <hkallweit1@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> On 28.01.2019 12:13, Peter Ceiley wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I have been experiencing very poor network performance since Kernel
>>>>>>>>>> 4.19 and I'm confident it's related to the r8169 driver.
>>>>>>>>>>
>>>>>>>>>> I have no issue with kernel versions 4.18 and prior. I am experiencing
>>>>>>>>>> this issue in kernels 4.19 and 4.20 (currently running/testing with
>>>>>>>>>> 4.20.4 & 4.19.18).
>>>>>>>>>>
>>>>>>>>>> If someone could guide me in the right direction, I'm happy to help
>>>>>>>>>> troubleshoot this issue. Note that I have been keeping an eye on one
>>>>>>>>>> issue related to loading of the PHY driver, however, my symptoms
>>>>>>>>>> differ in that I still have a network connection. I have attempted to
>>>>>>>>>> reload the driver on a running system, but this does not improve the
>>>>>>>>>> situation.
>>>>>>>>>>
>>>>>>>>>> Using the proprietary r8168 driver returns my device to proper working order.
>>>>>>>>>>
>>>>>>>>>> lshw shows:
>>>>>>>>>>        description: Ethernet interface
>>>>>>>>>>        product: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
>>>>>>>>>>        vendor: Realtek Semiconductor Co., Ltd.
>>>>>>>>>>        physical id: 0
>>>>>>>>>>        bus info: pci@0000:03:00.0
>>>>>>>>>>        logical name: enp3s0
>>>>>>>>>>        version: 0c
>>>>>>>>>>        serial:
>>>>>>>>>>        size: 1Gbit/s
>>>>>>>>>>        capacity: 1Gbit/s
>>>>>>>>>>        width: 64 bits
>>>>>>>>>>        clock: 33MHz
>>>>>>>>>>        capabilities: pm msi pciexpress msix vpd bus_master cap_list
>>>>>>>>>> ethernet physical tp aui bnc mii fibre 10bt 10bt-fd 100bt 100bt-fd
>>>>>>>>>> 1000bt-fd autonegotiation
>>>>>>>>>>        configuration: autonegotiation=on broadcast=yes driver=r8169
>>>>>>>>>> duplex=full firmware=rtl8168g-2_0.0.1 02/06/13 ip=192.168.1.25
>>>>>>>>>> latency=0 link=yes multicast=yes port=MII speed=1Gbit/s
>>>>>>>>>>        resources: irq:19 ioport:d000(size=256)
>>>>>>>>>> memory:f7b00000-f7b00fff memory:f2100000-f2103fff
>>>>>>>>>>
>>>>>>>>>> Kind Regards,
>>>>>>>>>>
>>>>>>>>>> Peter.
>>>>>>>>>>
>>>>>>>>> Hi Peter,
>>>>>>>>>
>>>>>>>>> the description "poor network performance" is quite vague, therefore:
>>>>>>>>>
>>>>>>>>> - Can you provide any measurements?
>>>>>>>>> - iperf results before and after
>>>>>>>>> - statistics about dropped packets (rx and/or tx)
>>>>>>>>> - Do you use jumbo packets?
>>>>>>>>>
>>>>>>>>> Also help would be a "lspci -vv" output for the network card and
>>>>>>>>> the dmesg output line with the chip XID.
>>>>>>>>>
>>>>>>>>> Heiner
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> 


  reply	other threads:[~2019-01-31  6:43 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-28 11:13 r8169 Driver - Poor Network Performance Since Kernel 4.19 Peter Ceiley
2019-01-28 18:28 ` Heiner Kallweit
2019-01-28 22:10   ` Peter Ceiley
2019-01-29  6:16     ` Heiner Kallweit
2019-01-29  6:20       ` Peter Ceiley
2019-01-29  6:44         ` Heiner Kallweit
2019-01-30  9:59           ` Peter Ceiley
2019-01-30 19:15             ` Heiner Kallweit
2019-01-31  2:32               ` David Chang
2019-01-31  6:21                 ` Heiner Kallweit
2019-01-31  6:35                   ` Heiner Kallweit [this message]
2019-01-31  6:49                     ` Heiner Kallweit
2019-01-31  7:23                     ` David Chang
2019-01-31 12:09                       ` Peter Ceiley
2019-01-31 18:28                         ` Heiner Kallweit
2019-02-01  4:27                           ` David Chang
2019-02-01  4:29                     ` David Chang
2019-02-01  6:32                       ` Heiner Kallweit
2019-02-02 12:25                 ` Heiner Kallweit
2019-02-05 18:50                 ` Heiner Kallweit
2019-02-05 18:53                   ` Heiner Kallweit
2019-02-11  6:23                   ` David Chang
2019-02-14  2:45                   ` David Chang
2019-02-14  6:17                     ` Heiner Kallweit
2019-02-15  2:51                       ` David Chang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4d832c16-8830-b746-a818-6026c2e6725c@gmail.com \
    --to=hkallweit1@gmail.com \
    --cc=dchang@suse.com \
    --cc=netdev@vger.kernel.org \
    --cc=nic_swsd@realtek.com \
    --cc=peter@ceiley.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).