From: Prashant <prashant@broadcom.com>
To: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Michael Chan <mchan@broadcom.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	David Vrabel <david.vrabel@citrix.com>,
	Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>,
	Vlad Yasevich <vyasevich@gmail.com>,
	<xen-devel@lists.xensource.com>, <netdev@vger.kernel.org>
Subject: Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
Date: Sat, 11 Apr 2015 01:01:52 -0700
Message-ID: <5528D4F0.6060203@broadcom.com>
In-Reply-To: <21799.59138.666831.970946@mariner.uk.xensource.com>

On 4/10/2015 8:06 AM, Ian Jackson wrote:
> (I switched to a different test box "elbling1" with the same symptoms:
> ~25% packet loss in ping under 64-bit Xen with 32-bit x86 Linux; 100%
> loss Linux x86 32-bit baremetal with `iommu=soft swiotlb=force'.  In
> each case I had disabled the bridge setup so was just using eth0.)
>
> Once again, tcpdumping eth0 with machine booted baremetal with the
> `iommu...' boot options shows corrupted packets on the receive path:
>
> Full transcript below.  The non-corrupted packets (ARP requests) in
> the tcpdump are outgoing: 172.16.144.31 is elbling1.
>
> I think the packets are being dropped by the non-tg3 part of the
> kernel due to their protocol field having been corrupted.

> Also:
>
> root@elbling1:~# ethtool -S eth0 | grep -v ': 0$'
> NIC statistics:
>       rx_octets: 352487
>       rx_ucast_packets: 250
>       rx_mcast_packets: 1165
>       rx_bcast_packets: 1806
>       tx_octets: 15848
>       tx_mcast_packets: 8
>       tx_bcast_packets: 237
> root@elbling1:~# ifconfig eth0
> eth0      Link encap:Ethernet  HWaddr b0:83:fe:db:b6:69
>            inet addr:172.16.144.31  Bcast:172.16.147.255
>            Mask:255.255.252.0
>            inet6 addr: fe80::b283:feff:fedb:b669/64 Scope:Link
>            UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>            RX packets:3245 errors:0 dropped:223 overruns:0 frame:0
>            TX packets:245 errors:0 dropped:0 overruns:0 carrier:0
>            collisions:0 txqueuelen:1000
>            RX bytes:355364 (347.0 KiB)  TX bytes:15848 (15.4 KiB)
>            Interrupt:16
>
> root@elbling1:~#
>
Thanks for the detailed info. Looking at the logs, it appears that
sometimes the descriptor itself is corrupted (the drop count goes up
because error bits get set in the descriptor), and in other instances
the RX data buffer is corrupted (as seen in the tcpdump).

I tried to reproduce the problem on a 32-bit 3.14.34 stable kernel on
bare metal, with iommu=soft swiotlb=force, but had no luck: no drops or
errors. I have not tried 64-bit Xen yet. By the way, I need a PCIe
analyzer trace to confirm the problem. Is it feasible to capture one at
your end?
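
For anyone comparing runs, a minimal sketch of how the receive drop
ratio could be computed from the ifconfig counters quoted above. The
function names are hypothetical and the input line is copied from the
quoted output. Whether "RX packets" already includes the dropped frames
depends on the driver's accounting, so the ratio is only approximate.

```python
import re

# RX counter line copied from the quoted `ifconfig eth0` output.
IFCONFIG_RX_LINE = "RX packets:3245 errors:0 dropped:223 overruns:0 frame:0"


def parse_counters(line):
    """Return the name:value pairs of an old-style ifconfig counter line."""
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)", line)}


def drop_fraction(counters):
    """Fraction of received packets reported as dropped (0.0 if none received)."""
    packets = counters["packets"]
    return counters["dropped"] / packets if packets else 0.0


rx = parse_counters(IFCONFIG_RX_LINE)
print(f"dropped {rx['dropped']} of {rx['packets']} RX packets "
      f"({drop_fraction(rx):.1%})")
```

Running the same calculation before and after a test would show whether
the drop counter grows in step with the lost pings.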
