From: Eric Dumazet <eric.dumazet@gmail.com>
To: Thomas Jarosch <thomas.jarosch@intra2net.com>
Cc: 'Linux Netdev List' <netdev@vger.kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	Jeff Kirsher <jeffrey.t.kirsher@intel.com>,
	e1000-devel <e1000-devel@lists.sourceforge.net>
Subject: Re: Re: Re: [bisected regression] e1000e: "Detected Hardware Unit Hang"
Date: Thu, 15 Jan 2015 08:00:58 -0800
Message-ID: <1421337658.11734.76.camel@edumazet-glaptop2.roam.corp.google.com>
In-Reply-To: <3089325.gjrPpo2XX1@storm>

On Thu, 2015-01-15 at 16:48 +0100, Thomas Jarosch wrote:
> On Thursday, 15. January 2015 07:25:32 Eric Dumazet wrote:
> > On Thu, 2015-01-15 at 15:58 +0100, Thomas Jarosch wrote:
> > > A colleague mentioned to me he saw the "Hardware Unit Hang" message every
> > > few days even running on kernel 3.4 (without your patch). Basically I'm
> > > testing now if that's still the case with 3.19-rc4+ or not.
> > > 
> > > I'm all for fixing the root cause. I'm just interested if the e1000e
> > > hang can even be triggered when using a max frag page size of 4096.
> > > So far it transferred 751.6 GiB without a hiccup.
> > 
> > You said it was a forwarding setup.
> > 
> > 1) Which NIC is receiving the traffic?
> > 2) What happens if you disable GRO on it?
> 
> The setup is like this:
> 
> Win7 notebook (client)
>     -> "private LAN" eth0 (e1000e)
>         -> "external traffic" eth1 (r8169)
> 
>             -> local HTTP server in the intranet
>                (2x e1000e using bonding)
> 
> 
> Disabling GRO on eth1 (r8169) seems to make eth0 (e1000e) stable.
> It usually hangs within seconds, but it has already transferred 28 GiB so far.
> 
> When I switch GRO back on, it takes around three seconds until the hang.
> 
> Does that point in the right (or any) direction?

Sure.

Please apply this patch, then lower /proc/sys/net/core/gro_max_frags
(leaving GRO enabled) and see if it makes a difference.

Start with 7 and increase it; the limit is 17.
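
For anyone who wants to script the test rather than echo values by hand, here
is a minimal user-space sketch in C. It assumes the patch below is applied, so
that /proc/sys/net/core/gro_max_frags exists. The helper name
set_gro_max_frags() and the accepted range of 1 to 17 are illustrative
assumptions, not part of the patch (the sysctl itself uses plain
proc_dointvec and does no range checking).

```c
#include <stdio.h>

/* Values taken from the mail: start at 7, upper limit 17. */
#define GRO_FRAGS_START 7
#define GRO_FRAGS_LIMIT 17

/*
 * Hypothetical helper: write `value` to the sysctl file at `path`.
 * The 1..GRO_FRAGS_LIMIT range check is an assumption of this sketch.
 * Returns 0 on success, -1 on a rejected value or an I/O error.
 */
int set_gro_max_frags(const char *path, int value)
{
	FILE *f;
	int ok;

	if (value < 1 || value > GRO_FRAGS_LIMIT)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", value) > 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Run as root, a call such as
set_gro_max_frags("/proc/sys/net/core/gro_max_frags", GRO_FRAGS_START)
would apply the first value to try; repeat with larger values while watching
for the hang.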

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 642d426a668f8ac94daf334c00117f96789f3990..817aee05a1b0623e5752beb0952a6fe6d66e583f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3400,6 +3400,7 @@ extern int		netdev_max_backlog;
 extern int		netdev_tstamp_prequeue;
 extern int		weight_p;
 extern int		bpf_jit_enable;
+extern int		sysctl_gro_max_frags;
 
 bool netdev_has_upper_dev(struct net_device *dev, struct net_device *upper_dev);
 struct net_device *netdev_upper_get_next_dev_rcu(struct net_device *dev,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 56db472e9b864e805e0ab36dd73a0404d2fc66d5..c2c2e7e53014617c5da574f2eb8a2889ed743719 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3197,6 +3197,8 @@ err:
 }
 EXPORT_SYMBOL_GPL(skb_segment);
 
+int sysctl_gro_max_frags = MAX_SKB_FRAGS;
+
 int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
 {
 	struct skb_shared_info *pinfo, *skbinfo = skb_shinfo(skb);
@@ -3219,8 +3221,8 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
 		int i = skbinfo->nr_frags;
 		int nr_frags = pinfo->nr_frags + i;
 
-		if (nr_frags > MAX_SKB_FRAGS)
-			goto merge;
+		if (nr_frags > sysctl_gro_max_frags)
+			return -E2BIG;
 
 		offset -= headlen;
 		pinfo->nr_frags = nr_frags;
@@ -3252,8 +3254,8 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
 		unsigned int first_size = headlen - offset;
 		unsigned int first_offset;
 
-		if (nr_frags + 1 + skbinfo->nr_frags > MAX_SKB_FRAGS)
-			goto merge;
+		if (nr_frags + 1 + skbinfo->nr_frags > sysctl_gro_max_frags)
+			return -E2BIG;
 
 		first_offset = skb->data -
 			       (unsigned char *)page_address(page) +
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 31baba2a71ce15e49450f69dae81e7d3be1ff3f2..de73d51381bf8acd0aedeb859ed961468441014a 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -278,6 +278,13 @@ static struct ctl_table net_core_table[] = {
 		.proc_handler	= proc_dointvec
 	},
 	{
+		.procname	= "gro_max_frags",
+		.data		= &sysctl_gro_max_frags,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec
+	},
+	{
 		.procname	= "netdev_rss_key",
 		.data		= &netdev_rss_key,
 		.maxlen		= sizeof(int),
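
To make the effect of the two changed checks easier to see, here is a small
user-space model in C. It is only a sketch of the arithmetic visible in the
diff: the model_* names are hypothetical, the constant 17 is taken from the
limit mentioned above, and the real skb_gro_receive() does far more than this.
With the patch, a merge that would exceed the tunable is refused with -E2BIG
instead of jumping to the merge label.

```c
#include <errno.h>

/* Stand-in for MAX_SKB_FRAGS; 17 is the limit quoted in the mail. */
#define MODEL_MAX_SKB_FRAGS 17

/* Hypothetical user-space stand-in for sysctl_gro_max_frags. */
int model_gro_max_frags = MODEL_MAX_SKB_FRAGS;

/*
 * Models the first patched check: the incoming skb's frags are appended
 * to the head skb's frags. Returns the new frag count, or -E2BIG when
 * the sum would exceed the tunable.
 */
int model_merge_frags(int head_frags, int skb_frags)
{
	int nr_frags = head_frags + skb_frags;

	if (nr_frags > model_gro_max_frags)
		return -E2BIG;
	return nr_frags;
}

/*
 * Models the second patched check: the incoming skb's linear part
 * takes one extra frag slot in addition to its own frags.
 */
int model_merge_head_frag(int head_frags, int skb_frags)
{
	int nr_frags = head_frags + 1 + skb_frags;

	if (nr_frags > model_gro_max_frags)
		return -E2BIG;
	return nr_frags;
}
```

In this model, lowering model_gro_max_frags to 7 makes
model_merge_frags(4, 4) return -E2BIG, while model_merge_frags(4, 3) still
succeeds. That is the knob the test above is meant to exercise.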
