From: Jesper Krogh <jesper-Q2TZfHgGEy4@public.gmane.org>
To: Jesse Brandeburg
	<jesse.brandeburg-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Jesse Brandeburg
	<jesse.brandeburg-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Subject: Re: ixgbe_clean_tx_irq: tx hang 1 detected, resetting adapter 	(2.6.32.8)
Date: Mon, 22 Feb 2010 19:40:20 +0100
Message-ID: <4B82CF94.5070903@krogh.cc>
In-Reply-To: <4807377b1002171453n277cfea3s6d7f3629bd43f674-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

Jesse Brandeburg wrote:
> On Sun, Feb 14, 2010 at 8:29 AM, Jesper Krogh <jesper-Q2TZfHgGEy4@public.gmane.org> wrote:
>> Hi List.
>>
>> I have tried to get a dual bond of 2 x 10G NICs using the
>> Intel Corporation 82598EB 10-Gigabit AT2 Server Adapter (rev 01)
>> going. As first it looked like it "just worked" but when tried to fill
>> the links with data one of the NIC's (eth7) hang and did a reset of
>> itself, so all data was pushed through the other NIC in the bond (eth8)
>>
>> Full dmesg below, but I think the important part is this:
>>
>> [ 2162.745354] ixgbe: eth7: ixgbe_check_tx_hang: Detected Tx Unit Hang
>> [ 2162.745356]   Tx Queue             <4>
>> [ 2162.745356]   TDH, TDT             <e1>, <cc>
>> [ 2162.745357]   next_to_use          <cc>
>> [ 2162.745358]   next_to_clean        <e1>
>> [ 2162.745359] tx_buffer_info[next_to_clean]
>> [ 2162.745359]   time_stamp           <1000713d3>
>> [ 2162.745360]   jiffies              <10007152e>
>> [ 2163.162478] ixgbe: eth7: ixgbe_clean_tx_irq: tx hang 1 detected, resetting adapter
>> [ 2163.357333] bonding: bond0: link status definitely down for interface eth7, disabling it
>> [ 2168.670342] ixgbe: eth7 NIC Link is Up 10 Gbps, Flow Control: None
> 
> Hi Jesper, my first thought was flow control, but I can see you have it off.

I didn't change it, so it's at the default.

> Can we get some more details on the hardware and bios version?  

Sun X4600, 128GB RAM, 16 cores.

> What
> about some dmidecode output.  I'm checking here if we have any
> hardware like this.

http://shrek.krogh.cc/~jesper/dmidecode.txt


> are you running ubuntu 9.10 or something else?

We're on Ubuntu 8.04, but we basically "only" need a filesystem, a kernel
with a well-performing NFS server, and NICs that work well :-)

> Wow, thats a monster machine, 8 nodes, 128GB ram.  Can we get a full
> lspci -vvv output, as well as ethtool -e eth7 and eth8

http://shrek.krogh.cc/~jesper/lspci.txt

http://shrek.krogh.cc/~jesper/ethtool-eth7.txt
http://shrek.krogh.cc/~jesper/ethtool-eth8.txt

> 32 has ixgbe with a known issue of multiple mappings on transmit
> possibly causing some problems, could it be that you're running into
> this?  can you apply commit e5a43549f7a58509a91b299a51337d386697b92c
> and see if it fixes your issue?

I'll do that the next time I can push a reboot through.
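
For anyone reading the hang report quoted above, here is a rough sketch of
how its fields can be decoded. It is only an illustration: the helper names
are made up for this example, the bracketed values are assumed to be
hexadecimal (they contain a-f digits), and the ring size of 512 descriptors
is an assumption, not something taken from the log.

```python
import re

# Fields copied from the quoted dmesg output (eth7, 2.6.32.8).
HANG_REPORT = """\
[ 2162.745356]   TDH, TDT             <e1>, <cc>
[ 2162.745357]   next_to_use          <cc>
[ 2162.745358]   next_to_clean        <e1>
[ 2162.745359]   time_stamp           <1000713d3>
[ 2162.745360]   jiffies              <10007152e>
"""


def parse_hang_report(text):
    """Return {field name: int}, reading the <...> values as hexadecimal."""
    fields = {}
    for line in text.splitlines():
        # Drop the leading "[ seconds.micros]" dmesg timestamp.
        line = re.sub(r"^\[\s*\d+\.\d+\]\s*", "", line)
        values = re.findall(r"<([0-9a-fA-F]+)>", line)
        if not values:
            continue
        # "TDH, TDT  <e1>, <cc>" carries two names and two values.
        names = [n.strip() for n in line.split("<", 1)[0].split(",")]
        for name, value in zip(names, values):
            fields[name] = int(value, 16)
    return fields


def pending_descriptors(head, tail, ring_size):
    """Descriptors queued between head and tail on a circular ring."""
    return (tail - head) % ring_size


def stalled_jiffies(fields):
    """Age of the oldest uncleaned buffer, in jiffies."""
    return fields["jiffies"] - fields["time_stamp"]


if __name__ == "__main__":
    fields = parse_hang_report(HANG_REPORT)
    ASSUMED_RING_SIZE = 512  # assumption: not reported in the log
    print("stalled for", stalled_jiffies(fields), "jiffies")
    print("pending descriptors:",
          pending_descriptors(fields["TDH"], fields["TDT"], ASSUMED_RING_SIZE))
```

The time stamp is 0x15b = 347 jiffies old; how long that is in seconds
depends on the kernel's HZ setting, which the log does not show. TDH equals
next_to_clean, so the driver has cleaned everything the hardware completed,
and the tail has wrapped around past the head.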

-- 
Jesper