netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: RTL8169 question
       [not found]   ` <20191008102706.e3f57ffe3e779802898a99ee@skyboo.net>
@ 2019-10-08 20:04     ` Heiner Kallweit
  2019-10-09 12:20       ` Mariusz Bialonczyk
  0 siblings, 1 reply; 2+ messages in thread
From: Heiner Kallweit @ 2019-10-08 20:04 UTC (permalink / raw)
  To: Mariusz Bialonczyk; +Cc: netdev

On 08.10.2019 10:27, Mariusz Bialonczyk wrote:
> Heiner,
> Hello again.
> 
> On Wed, 25 Sep 2019 21:28:04 +0200
> Heiner Kallweit <hkallweit1@gmail.com> wrote:
> 
>>> Here is the brief description of the problem:
>>> After the NIC is going out of suspend-to-ram it is in some weird state.
>>> This state is causing problems on the cisco switch.
>>> If I understand the problem correctly my assumption is that the driver
>>> is probably waking up the NIC wrongly and this is leading to problems
>>> with cisco switch (which probably also have a bug in the firmware)... :)
>>>
>> The MTU setting doesn't affect L2 (link level), so it's not clear how this
>> should cause problems with the switch. Your RTL8168 chip version is ancient
>> and I can't rule out a HW issue with this chip version, so the best advice
>> I can give is: don't use jumbo packets (as chip also doesn't support
>> HW checksumming for jumbo packets).
> After hours spent of wireshark/tcpdump analysis I found the problem :)
> The RTL8168 chip or linux driver is malforming packets after wakeup from
> suspend-to-ram. I am able to catch a single packets which are starting with
> random src and dst MAC addresses (samples in attachment).
> The problem is when the MTU is set to 9000.
> These packets are single ones in the stream but it is affecting transmission.
> Do you think it would be easy catch and fix the problem in the linux driver?
> I have near 100% reproduce rate of the problem with 'stress' tool.

Thanks for the comprehensive analysis! Comparing the chip registers before
and after suspend your BIOS seems to change registers on resume from suspend,
and the driver doesn't configure jumbo when resuming. This may explain the
issue. The combination jumbo + suspend + BIOS bug seems to be quite rare,
else I think we should have seen such a report years ago already.
Could you please check whether the following patch fixes the issue for you?

Heiner

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 74f81fe03..350b0d949 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -4146,6 +4146,14 @@ static void rtl_hw_jumbo_disable(struct rtl8169_private *tp)
 	rtl_lock_config_regs(tp);
 }
 
+static void rtl_jumbo_config(struct rtl8169_private *tp, int mtu)
+{
+	if (mtu > ETH_DATA_LEN)
+		rtl_hw_jumbo_enable(tp);
+	else
+		rtl_hw_jumbo_disable(tp);
+}
+
 DECLARE_RTL_COND(rtl_chipcmd_cond)
 {
 	return RTL_R8(tp, ChipCmd) & CmdReset;
@@ -4442,11 +4450,6 @@ static void rtl8168g_set_pause_thresholds(struct rtl8169_private *tp,
 static void rtl_hw_start_8168bb(struct rtl8169_private *tp)
 {
 	RTL_W8(tp, Config3, RTL_R8(tp, Config3) & ~Beacon_en);
-
-	if (tp->dev->mtu <= ETH_DATA_LEN) {
-		rtl_tx_performance_tweak(tp, PCI_EXP_DEVCTL_READRQ_4096B |
-					 PCI_EXP_DEVCTL_NOSNOOP_EN);
-	}
 }
 
 static void rtl_hw_start_8168bef(struct rtl8169_private *tp)
@@ -4462,9 +4465,6 @@ static void __rtl_hw_start_8168cp(struct rtl8169_private *tp)
 
 	RTL_W8(tp, Config3, RTL_R8(tp, Config3) & ~Beacon_en);
 
-	if (tp->dev->mtu <= ETH_DATA_LEN)
-		rtl_tx_performance_tweak(tp, PCI_EXP_DEVCTL_READRQ_4096B);
-
 	rtl_disable_clock_request(tp);
 }
 
@@ -4490,9 +4490,6 @@ static void rtl_hw_start_8168cp_2(struct rtl8169_private *tp)
 	rtl_set_def_aspm_entry_latency(tp);
 
 	RTL_W8(tp, Config3, RTL_R8(tp, Config3) & ~Beacon_en);
-
-	if (tp->dev->mtu <= ETH_DATA_LEN)
-		rtl_tx_performance_tweak(tp, PCI_EXP_DEVCTL_READRQ_4096B);
 }
 
 static void rtl_hw_start_8168cp_3(struct rtl8169_private *tp)
@@ -4503,9 +4500,6 @@ static void rtl_hw_start_8168cp_3(struct rtl8169_private *tp)
 
 	/* Magic. */
 	RTL_W8(tp, DBG_REG, 0x20);
-
-	if (tp->dev->mtu <= ETH_DATA_LEN)
-		rtl_tx_performance_tweak(tp, PCI_EXP_DEVCTL_READRQ_4096B);
 }
 
 static void rtl_hw_start_8168c_1(struct rtl8169_private *tp)
@@ -4611,9 +4605,6 @@ static void rtl_hw_start_8168e_1(struct rtl8169_private *tp)
 
 	rtl_ephy_init(tp, e_info_8168e_1);
 
-	if (tp->dev->mtu <= ETH_DATA_LEN)
-		rtl_tx_performance_tweak(tp, PCI_EXP_DEVCTL_READRQ_4096B);
-
 	rtl_disable_clock_request(tp);
 
 	/* Reset tx FIFO pointer */
@@ -4636,9 +4627,6 @@ static void rtl_hw_start_8168e_2(struct rtl8169_private *tp)
 
 	rtl_ephy_init(tp, e_info_8168e_2);
 
-	if (tp->dev->mtu <= ETH_DATA_LEN)
-		rtl_tx_performance_tweak(tp, PCI_EXP_DEVCTL_READRQ_4096B);
-
 	rtl_eri_write(tp, 0xc0, ERIAR_MASK_0011, 0x0000);
 	rtl_eri_write(tp, 0xb8, ERIAR_MASK_0011, 0x0000);
 	rtl_set_fifo_size(tp, 0x10, 0x10, 0x02, 0x06);
@@ -5485,6 +5473,8 @@ static void rtl_hw_start(struct  rtl8169_private *tp)
 	rtl_set_rx_tx_desc_registers(tp);
 	rtl_lock_config_regs(tp);
 
+	rtl_jumbo_config(tp, tp->dev->mtu);
+
 	/* Initially a 10 us delay. Turned it into a PCI commit. - FR */
 	RTL_R16(tp, CPlusCmd);
 	RTL_W8(tp, ChipCmd, CmdTxEnb | CmdRxEnb);
@@ -5498,10 +5488,7 @@ static int rtl8169_change_mtu(struct net_device *dev, int new_mtu)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
 
-	if (new_mtu > ETH_DATA_LEN)
-		rtl_hw_jumbo_enable(tp);
-	else
-		rtl_hw_jumbo_disable(tp);
+	rtl_jumbo_config(tp, new_mtu);
 
 	dev->mtu = new_mtu;
 	netdev_update_features(dev);
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: RTL8169 question
  2019-10-08 20:04     ` RTL8169 question Heiner Kallweit
@ 2019-10-09 12:20       ` Mariusz Bialonczyk
  0 siblings, 0 replies; 2+ messages in thread
From: Mariusz Bialonczyk @ 2019-10-09 12:20 UTC (permalink / raw)
  To: Heiner Kallweit; +Cc: netdev, Francois Romieu

On Tue, 8 Oct 2019 22:04:45 +0200
Heiner Kallweit <hkallweit1@gmail.com> wrote:

> Thanks for the comprehensive analysis! Comparing the chip registers before
> and after suspend your BIOS seems to change registers on resume from suspend,
> and the driver doesn't configure jumbo when resuming. This may explain the
> issue. The combination jumbo + suspend + BIOS bug seems to be quite rare,
> else I think we should have seen such a report years ago already.
I see.

> Could you please check whether the following patch fixes the issue for you?
I've tested your patch with minor change when applying on top of the current
master tree.

The result is: it is working great!
No more problems, that I was observing before.
Thank you very much, Heiner!

You can add:
Reported-by: Mariusz Bialonczyk <manio@skyboo.net>
Tested-by: Mariusz Bialonczyk <manio@skyboo.net>

regards,
-- 
Mariusz Białończyk | xmpp/e-mail: manio@skyboo.net
http://manio.skyboo.net | https://github.com/manio

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2019-10-09 12:58 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20190913114424.540c1d257c4083eace242bbf@skyboo.net>
     [not found] ` <c55484e7-9dfb-0e5e-3887-278a334ac831@gmail.com>
     [not found]   ` <20191008102706.e3f57ffe3e779802898a99ee@skyboo.net>
2019-10-08 20:04     ` RTL8169 question Heiner Kallweit
2019-10-09 12:20       ` Mariusz Bialonczyk

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).