From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Fujinaka, Todd"
Subject: RE: [Intel-wired-lan] Instability of i40e driver on 4.9 kernel
Date: Sat, 21 Oct 2017 00:07:23 +0000
Message-ID: <9B4A1B1917080E46B64F07F2989DADD697A3FFD1@ORSMSX114.amr.corp.intel.com>
In-Reply-To: <92118fc4-8a20-f129-193b-9c8fdf81aa24@gmail.com>
References: <92118fc4-8a20-f129-193b-9c8fdf81aa24@gmail.com>
To: Pavlos Parissis, "netdev@vger.kernel.org", "intel-wired-lan@lists.osuosl.org"

You picked a bunch of places to post this, and you really should've used a different place: e1000-devel@lists.sourceforge.net

Also, since you flagged the "communities" post as "answered", you're not likely to get any follow-up. The Intel communities are also not monitored as much by the wired networking people at Intel.

Please let us know if you have any specific issues, and please provide exact reproduction steps so we can investigate your issues, and please use e1000-devel.

Todd Fujinaka
Software Application Engineer
Datacenter Engineering Group
Intel Corporation
todd.fujinaka@intel.com

-----Original Message-----
From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On Behalf Of Pavlos Parissis
Sent: Thursday, October 19, 2017 4:03 PM
To: netdev@vger.kernel.org; intel-wired-lan@lists.osuosl.org
Subject: [Intel-wired-lan] Instability of i40e driver on 4.9 kernel

Hi all,

We have been running 4.9 kernels for several months on CentOS 7.3 and for a few weeks on CentOS 7.4, and, after we replaced 10GbE copper cards (X540-AT2 with the ixgbe driver) with X710 10GbE SFP cards using the i40e driver, we noticed severe instabilities on our servers.

On several servers the links were marked down and up again without any obvious reason, except for a lot of errors in kernel.log. We run the BIRD Internet Routing Daemon on our servers in order to establish BGP peerings with routers, and we have observed flapping on the BGP peerings. The kernel errors appeared at the same time as the BGP peering stability issues. We decided to go back to the 3.10 kernel from CentOS, but that process wasn't smooth, as the latest firmware gave us problems with speed detection. We rolled back to a firmware two versions older and the speed detection issue was resolved. We have been running 3.10 for several weeks without any problems. Even though we want certain functionality from kernel 4.9, we decided to switch back to 3.10, as the stability of our systems has higher priority.

I need to mention that in all occurrences of the issue we didn't see any anomalies, such as DDoS attacks.

I have opened https://communities.intel.com/message/501682#501682 and there you can find all the error messages and other information.

Since we noticed the issues, I have been following the netdev ML, and I know that there are a lot of improvements/patches queued up for 4.14. I am hoping those patches fix our issue and, most importantly, are sent to linux-stable for inclusion in the 4.9 kernel.

Cheers,
Pavlos