All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: RENARD Pierre-Francois <pfrenard@gmail.com>,
	nsaenzjulienne@suse.de, woojung.huh@microchip.com,
	UNGLinuxDriver@microchip.com, netdev@vger.kernel.org,
	linux-usb@vger.kernel.org, stefan.wahren@i2se.com
Subject: Re: [RPI 3B+ / TSO / lan78xx ]
Date: Tue, 7 Jan 2020 09:21:06 -0800	[thread overview]
Message-ID: <a49c9cb2-576c-005b-580b-57ac8313d478@gmail.com> (raw)
In-Reply-To: <863777f2-3a7b-0736-d0a4-d9966bea3f96@gmail.com>



On 1/7/20 9:04 AM, Eric Dumazet wrote:
> 
> 
> On 1/7/20 5:32 AM, RENARD Pierre-Francois wrote:
>>
>> Hello all
>>
>> I am facing an issue related to Raspberry PI 3B+ and onboard ethernet card.
>>
>> When doing a huge transfer (more than 1GB) in a row, transfer hanges and failed after a few minutes.
>>
>>
>> I have two ways to reproduce this issue
>>
>>
>> using NFS (v3 or v4)
>>
>>     dd if=/dev/zero of=/NFSPATH/file bs=4M count=1000 status=progress
>>
>>
>>     we can see that at some point dd hangs and becomes non interrutible (no way to ctrl-c it or kill it)
>>
>>     after afew minutes, dd dies and a bunch of NFS server not responding / NFS server is OK are seens into the journal
>>
>>
>> Using SCP
>>
>>     dd if=/dev/zero of=/tmp/file bs=4M count=1000
>>
>>     scp /tmp/file user@server:/directory
>>
>>
>>     scp hangs after 1GB and after a few minutes scp is failing with message "client_loop: send disconnect: Broken pipe lostconnection"
>>
>>
>>
>>
>> It appears, this is a known bug relatted to TCP Segmentation Offload & Selective Acknowledge.
>>
>> disabling this TSO (ethtool -K eth0 tso off & ethtool -K eth0 gso off) solves the issue.
>>
>> A patch has been created to disable the feature by default by the raspberry team and is by default applied wihtin raspbian.
>>
>> comment from the patch :
>>
>> /* TSO seems to be having some issue with Selective Acknowledge (SACK) that
>>  * results in lost data never being retransmitted.
>>  * Disable it by default now, but adds a module parameter to enable it for
>>  * debug purposes (the full cause is not currently understood).
>>  */
>>
>>
>> For reference you can find
>>
>> a link to the issue I created yesterday : https://github.com/raspberrypi/linux/issues/3395
>>
>> links to raspberry dev team : https://github.com/raspberrypi/linux/issues/2482 & https://github.com/raspberrypi/linux/issues/2449
>>
>>
>>
>> If you need me to test things, or give you more informations, I ll be pleased to help.
>>
> 
> 
> I doubt TSO and SACK have a serious generic bug like that.
> 
> Most likely the TSO implementation on the driver/NIC has a bug .
> 
> Anyway you do not provide a kernel version, I am not sure what you expect from us.
> 

Oh well, drivers/net/usb/lan78xx.c is horribly buggy.

It wants linear skbs, which is likely to fail with too big packets.

And if skb linearization fails, skb is not freed, so a big memory leak happens.

Please try this patch :

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index f940dc6485e56a7e8f905082ce920f5dd83232b0..5e2d3c8c34dc8d8ac6f2ab3fd8a59dba5b348882 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -2724,11 +2724,6 @@ static int lan78xx_stop(struct net_device *net)
        return 0;
 }
 
-static int lan78xx_linearize(struct sk_buff *skb)
-{
-       return skb_linearize(skb);
-}
-
 static struct sk_buff *lan78xx_tx_prep(struct lan78xx_net *dev,
                                       struct sk_buff *skb, gfp_t flags)
 {
@@ -2740,8 +2735,10 @@ static struct sk_buff *lan78xx_tx_prep(struct lan78xx_net *dev,
                return NULL;
        }
 
-       if (lan78xx_linearize(skb) < 0)
+       if (skb_linearize(skb)) {
+               dev_kfree_skb_any(skb);
                return NULL;
+       }
 
        tx_cmd_a = (u32)(skb->len & TX_CMD_A_LEN_MASK_) | TX_CMD_A_FCS_;
 
@@ -3790,6 +3787,9 @@ static int lan78xx_probe(struct usb_interface *intf,
        if (ret < 0)
                goto out4;
 
+       /* since we want linear skb, avoid high-order allocations */
+       netif_set_gso_max_size(netdev, SKB_WITH_OVERHEAD(16000));
+
        ret = register_netdev(netdev);
        if (ret != 0) {
                netif_err(dev, probe, netdev, "couldn't register the device\n");

  reply	other threads:[~2020-01-07 17:21 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <5267da21-8f12-2750-c0c5-4ed31b03833b@gmail.com>
2020-01-07 13:32 ` [RPI 3B+ / TSO / lan78xx ] RENARD Pierre-Francois
2020-01-07 17:04   ` Eric Dumazet
2020-01-07 17:21     ` Eric Dumazet [this message]
2020-01-07 17:30     ` Stefan Wahren
2020-01-07 18:06       ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a49c9cb2-576c-005b-580b-57ac8313d478@gmail.com \
    --to=eric.dumazet@gmail.com \
    --cc=UNGLinuxDriver@microchip.com \
    --cc=linux-usb@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=nsaenzjulienne@suse.de \
    --cc=pfrenard@gmail.com \
    --cc=stefan.wahren@i2se.com \
    --cc=woojung.huh@microchip.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.