From: Chris Clayton <chris2553@googlemail.com>
To: Heiner Kallweit <hkallweit1@gmail.com>,
	"Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
Cc: Azat Khuzhin <a3at.mail@gmail.com>,
	Realtek linux nic maintainers <nic_swsd@realtek.com>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: R8169: Network lockups in 4.18.{8,9,10} (and 4.19 dev)
Date: Wed, 10 Oct 2018 00:32:21 +0100
Message-ID: <4e33341b-8805-80dd-26fb-1b1a4d2a3eb9@googlemail.com>
In-Reply-To: <e11f8eae-2bca-037b-9a7e-43c4276a2be6@gmail.com>



On 09/10/2018 22:39, Heiner Kallweit wrote:
> On 09.10.2018 16:40, Chris Clayton wrote:
>> Thanks to Maciej and Heiner for their replies.
>>
>> On 09/10/2018 13:32, Maciej S. Szmigiero wrote:
>>> On 07.10.2018 21:36, Chris Clayton wrote:
>>>> Hi again,
>>>>
>>>> I didn't think there was anything in 4.19-rc7 to fix this regression, but tried it anyway. I can confirm that the
>>>> regression is still present and my network still fails when, after a resume from suspend (to ram or disk), I open my
>>>> browser or my mail client. In both those cases the failure is almost immediate - e.g. my home page doesn't get displayed
>>>> in the browser. Pinging one of my ISPs name servers doesn't fail quite so quickly but the reported time increases from
>>>> 14-15ms to more than 1000ms.
>>>
>>> You can try comparing chip registers (ethtool -d eth0) in the working
>>> state (before a suspend) and in the broken state (after a resume).
>>> Maybe there will be something obvious in the difference.
>>>
>>> The same goes for the PCI configuration (lspci -d :8168 -vv).
>>>
>> Maciej suggested comparing the output from lspci -vv for the ethernet device. They are identical.
>>
>> Both Maciej and Heiner suggested comparing the output from "ethtool -d" pre and post suspend. Again, they are identical.
>> Heiner specifically suggested looking at the RxConfig. The value of that is 0x0002870e both pre and post suspend.
>>
>> I've attached files I redirected the outputs to.
>>
>> Please don't hesitate to ask for any other information needed to solve this problem. In the meantime, I've now got
>> scripts that stop the network during suspend and restart it during resume. (Those scripts were removed whilst I gathered
>> the diagnostics shown in the attachments.)
>>
> I'd like to check whether it may be a timing issue. The following experimental patch
> adds a PCI commit after writing register ChipCmd. Could you please check whether
> it changes anything?
> 
> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> index 7d3f671e1..f3c359492 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -4641,6 +4641,7 @@ static void rtl_hw_start(struct  rtl8169_private *tp)
>  	/* Initially a 10 us delay. Turned it into a PCI commit. - FR */
>  	RTL_R8(tp, IntrMask);
>  	RTL_W8(tp, ChipCmd, CmdTxEnb | CmdRxEnb);
> +	RTL_R8(tp, ChipCmd);
>  	rtl_init_rxcfg(tp);
>  	rtl_set_tx_config_registers(tp);
>  
> 

Sorry, this patch doesn't make any difference - my network still fails. After a suspend/resume my browsers (chromium
and firefox) both fail to open my home page (https://www.google.co.uk). The ping time for one of my ISP's name servers
increases from 14-15ms to more than 1000ms, although after a few pings it does reduce. As the screen grab below
shows, the network does eventually fail:

$ ping NS1
PING ns1 (90.207.238.97): 56 data bytes
64 bytes from 90.207.238.97: icmp_seq=0 ttl=251 time=1017.289 ms
64 bytes from 90.207.238.97: icmp_seq=1 ttl=251 time=1018.051 ms
64 bytes from 90.207.238.97: icmp_seq=2 ttl=251 time=1015.271 ms
64 bytes from 90.207.238.97: icmp_seq=3 ttl=251 time=1015.495 ms
64 bytes from 90.207.238.97: icmp_seq=6 ttl=251 time=1015.646 ms
64 bytes from 90.207.238.97: icmp_seq=7 ttl=251 time=1022.609 ms
64 bytes from 90.207.238.97: icmp_seq=8 ttl=251 time=1015.612 ms
64 bytes from 90.207.238.97: icmp_seq=10 ttl=251 time=1015.551 ms
64 bytes from 90.207.238.97: icmp_seq=12 ttl=251 time=1015.446 ms
64 bytes from 90.207.238.97: icmp_seq=13 ttl=251 time=1015.657 ms
64 bytes from 90.207.238.97: icmp_seq=14 ttl=251 time=1015.614 ms
64 bytes from 90.207.238.97: icmp_seq=15 ttl=251 time=1015.651 ms
64 bytes from 90.207.238.97: icmp_seq=17 ttl=251 time=1015.459 ms
64 bytes from 90.207.238.97: icmp_seq=18 ttl=251 time=1015.443 ms
64 bytes from 90.207.238.97: icmp_seq=19 ttl=251 time=1015.936 ms
64 bytes from 90.207.238.97: icmp_seq=20 ttl=251 time=1015.681 ms
64 bytes from 90.207.238.97: icmp_seq=22 ttl=251 time=1015.410 ms
64 bytes from 90.207.238.97: icmp_seq=23 ttl=251 time=1015.487 ms
64 bytes from 90.207.238.97: icmp_seq=24 ttl=251 time=1016.169 ms
64 bytes from 90.207.238.97: icmp_seq=25 ttl=251 time=1015.659 ms
64 bytes from 90.207.238.97: icmp_seq=26 ttl=251 time=14.606 ms
64 bytes from 90.207.238.97: icmp_seq=30 ttl=251 time=32.765 ms
64 bytes from 90.207.238.97: icmp_seq=31 ttl=251 time=115.052 ms
64 bytes from 90.207.238.97: icmp_seq=33 ttl=251 time=757.115 ms
64 bytes from 90.207.238.97: icmp_seq=34 ttl=251 time=176.696 ms
64 bytes from 90.207.238.97: icmp_seq=35 ttl=251 time=1017.462 ms
64 bytes from 90.207.238.97: icmp_seq=36 ttl=251 time=16.394 ms
64 bytes from 90.207.238.97: icmp_seq=37 ttl=251 time=20.402 ms
64 bytes from 90.207.238.97: icmp_seq=38 ttl=251 time=37.795 ms
64 bytes from 90.207.238.97: icmp_seq=39 ttl=251 time=141.997 ms
92 bytes from laptop.local.lan (192.168.0.20): Destination Host Unreachable
92 bytes from laptop.local.lan (192.168.0.20): Destination Host Unreachable
...
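
The gaps in icmp_seq above (4, 5, 9, 11, 16, 21, 27-29 and 32 never got a reply) show that packets are being dropped as well as delayed. For anyone wanting to compare runs before and after a patch, here is a minimal sketch that summarises ping output of this shape. It assumes the reply-line format shown above; the function name summarise_ping and the 500 ms "slow" threshold are illustrative choices, not anything taken from ping or the r8169 driver.

```python
import re

# Matches reply lines such as:
#   64 bytes from 90.207.238.97: icmp_seq=0 ttl=251 time=1017.289 ms
REPLY_RE = re.compile(r"icmp_seq=(\d+)\s+ttl=\d+\s+time=([\d.]+)\s*ms")


def summarise_ping(text, slow_ms=500.0):
    """Return (missing sequence numbers, slow reply count, reply count).

    slow_ms is an arbitrary illustrative threshold, not a ping setting.
    """
    replies = {}
    for line in text.splitlines():
        m = REPLY_RE.search(line)
        if m:
            replies[int(m.group(1))] = float(m.group(2))
    if not replies:
        return [], 0, 0
    first, last = min(replies), max(replies)
    # Any sequence number between the first and last reply that never
    # appeared is counted as a lost packet.
    missing = [s for s in range(first, last + 1) if s not in replies]
    slow = sum(1 for t in replies.values() if t >= slow_ms)
    return missing, slow, len(replies)


# A few lines copied from the output above.
sample = """\
64 bytes from 90.207.238.97: icmp_seq=3 ttl=251 time=1015.495 ms
64 bytes from 90.207.238.97: icmp_seq=6 ttl=251 time=1015.646 ms
64 bytes from 90.207.238.97: icmp_seq=7 ttl=251 time=1022.609 ms
64 bytes from 90.207.238.97: icmp_seq=8 ttl=251 time=1015.612 ms
64 bytes from 90.207.238.97: icmp_seq=10 ttl=251 time=1015.551 ms
"""

missing, slow, total = summarise_ping(sample)
print("missing:", missing, "slow:", slow, "replies:", total)
```

Only sequence numbers between the first and last reply are checked, so packets lost once the host becomes unreachable are not counted.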


Chris

