From: Prashant <prashant@broadcom.com>
To: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Michael Chan <mchan@broadcom.com>,
    Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
    Boris Ostrovsky <boris.ostrovsky@oracle.com>,
    David Vrabel <david.vrabel@citrix.com>,
    Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>,
    Vlad Yasevich <vyasevich@gmail.com>,
    <xen-devel@lists.xensource.com>, <netdev@vger.kernel.org>
Subject: Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
Date: Sat, 11 Apr 2015 01:01:52 -0700
Message-ID: <5528D4F0.6060203@broadcom.com>
In-Reply-To: <21799.59138.666831.970946@mariner.uk.xensource.com>

On 4/10/2015 8:06 AM, Ian Jackson wrote:
> (I switched to a different test box "elbling1" with the same symptoms:
> ~25% packet loss in ping under 64-bit Xen with 32-bit x86 Linux; 100%
> loss Linux x86 32-bit baremetal with `iommu=soft swiotlb=force'. In
> each case I had disabled the bridge setup so was just using eth0.)
>
> Once again, tcpdumping eth0 with machine booted baremetal with the
> `iommu...' boot options shows corrupted packets on the receive path:
>
> Full transcript below. The non-corrupted packets (ARP requests) in
> the tcpdump are outgoing: 172.16.144.31 is elbling1.
>
> I think the packets are being dropped by the non-tg3 part of the
> kernel due to their protocol field having been corrupted.
>
> Also:
>
> root@elbling1:~# ethtool -S eth0 | grep -v ': 0$'
> NIC statistics:
>      rx_octets: 352487
>      rx_ucast_packets: 250
>      rx_mcast_packets: 1165
>      rx_bcast_packets: 1806
>      tx_octets: 15848
>      tx_mcast_packets: 8
>      tx_bcast_packets: 237
> root@elbling1:~# ifconfig eth0
> eth0      Link encap:Ethernet  HWaddr b0:83:fe:db:b6:69
>           inet addr:172.16.144.31  Bcast:172.16.147.255
>           Mask:255.255.252.0
>           inet6 addr: fe80::b283:feff:fedb:b669/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:3245 errors:0 dropped:223 overruns:0 frame:0
>           TX packets:245 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:355364 (347.0 KiB)  TX bytes:15848 (15.4 KiB)
>           Interrupt:16
>
> root@elbling1:~#
>

Thanks for the detailed info. Looking at the logs, it appears that
sometimes the descriptor itself is corrupted (the drop count goes up
because error bits get set in the descriptor), and in some instances
the RX data buffer is corrupted (as seen in the tcpdump).

I tried to reproduce the problem on a 32-bit 3.14.34 stable kernel on
baremetal with iommu=soft swiotlb=force, but had no luck: no drops or
errors. I have not tried with 64-bit Xen yet.

Btw, I need a PCIe analyzer trace to confirm the problem. Is it
feasible to capture one at your end?