* bnx2x - odd behaviour
From: Ian Kumlien @ 2019-04-03 15:00 UTC
To: Linux Kernel Network Developers, aelior, skalluru

Hi,

We just had this happen on 5.0.2

It looks like the interface went down, ended up in a broken state and
an ip li set down/up dev enp2s0f0 made it work again

It looks really weird and I haven't really seen anything like it,
anyone with a clue?

dmesg:
....
[1310361.808694] bnx2x 0000:02:00.0 enp2s0f0: NIC Link is Down
[1310361.824554] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1310362.872678] bond0: link status definitely down for interface enp2s0f0, disabling it
[1310362.880691] device enp2s0f0 left promiscuous mode
[1310363.188592] bnx2x 0000:02:00.0 enp2s0f0: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - transmit
[1310363.200653] bond0: link status up for interface enp2s0f0, enabling it in 0 ms
[1310363.208192] bond0: link status definitely up for interface enp2s0f0, 10000 Mbps full duplex
[1310363.216885] bond0: making interface enp2s0f0 the new active one
[1310363.223075] device enp2s0f0 entered promiscuous mode
[1310363.228613] bond0: first active interface up!
[1310364.048805] bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
[1310364.058297] bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
[1310365.072604] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1)
[1310366.096679] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (2)
[1310366.103922] bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
[1310366.113387] bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
[1310367.120518] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (3)
[1310368.144635] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (4)
[1310369.168591] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (5)
[1310371.216519] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (6)
... it does go on ...
[1312156.028230] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1520)
[1312157.052226] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1521)
[1312157.059842] bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
[1312157.069242] bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
[1312158.076261] bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
[1312158.085657] bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
[1312159.100154] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1522)
[1312160.124226] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1523)
[1312161.148127] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1524)
[1312162.172102] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1525)
[1312163.196000] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1526)
[1312163.203610] bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
[1312163.213082] bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
[1312164.220248] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1527)
[1312165.244119] bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
[1312165.253524] bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
[1312166.268053] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1528)
[1312167.292105] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1529)
[1312168.316022] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1530)
[1312169.340014] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1531)
[1312169.347584] bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
[1312169.357054] bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384

... trying to bring it down ...

[1312169.659992] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312169.672041] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312169.682084] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312169.692159] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312169.702026] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312169.712081] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312169.722097] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312169.732073] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312169.742079] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312169.752066] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312169.762017] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312169.771958] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312169.782085] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
.... on and on ...
[1312170.434045] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312170.444012] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312170.454024] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312170.463879] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312170.473950] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312170.484107] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
[1312171.532119] bond0: link status definitely down for interface enp2s0f0, disabling it

... bringing it up again ...

[1312171.540128] device enp2s0f0 left promiscuous mode
[1312189.213375] bnx2x 0000:02:00.0 enp2s0f0: using MSI-X IRQs: sp 42 fp[0] 44 ... fp[7] 51
[1312190.780919] bnx2x 0000:02:00.0 enp2s0f0: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - transmit
[1312190.787840] bond0: link status up for interface enp2s0f0, enabling it in 0 ms
[1312190.798618] bond0: link status definitely up for interface enp2s0f0, 10000 Mbps full duplex
[1312190.807307] bond0: making interface enp2s0f0 the new active one
[1312190.813560] device enp2s0f0 entered promiscuous mode
[1312190.820884] bond0: first active interface up!
---

* RE: bnx2x - odd behaviour
From: Sudarsana Reddy Kalluru @ 2019-04-04 14:27 UTC
To: Ian Kumlien, Linux Kernel Network Developers, Ariel Elior

Hi,
We are not aware of this issue. Please collect the register dump i.e., "ethtool -d <interface>" output when this issue happens (before performing link-flap) and share it for the analysis.

Thanks,
Sudarsana

> -----Original Message-----
> From: netdev-owner@vger.kernel.org <netdev-owner@vger.kernel.org> On
> Behalf Of Ian Kumlien
> Sent: Wednesday, April 3, 2019 8:31 PM
> To: Linux Kernel Network Developers <netdev@vger.kernel.org>; Ariel Elior
> <aelior@marvell.com>; Sudarsana Reddy Kalluru <skalluru@marvell.com>
> Subject: bnx2x - odd behaviour
>
> Hi,
>
> We just had this happen on 5.0.2
>
> It looks like the interface went down, ended up in a broken state and a ip li
> set down/up dev enp2s0f0 made it work again
>
> It looks really weird and I haven't really seen anything like it, anyone with a
> clue?
>
> dmesg:
> [...]

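The request in the message above comes down to an ordering: capture state while the port is still broken, and only then flap the link. A minimal sketch of that order, assuming a POSIX shell run as root; the function name dump_then_flap and the output file names are made up for illustration and do not come from the thread:

```shell
#!/bin/sh
# Sketch only: dump_then_flap and the output file names are hypothetical.
# The register dump is taken while the port is still in the failed state;
# the link flap happens last because it clears the condition.
dump_then_flap() {
    iface="$1"
    outdir="${2:-.}"

    mkdir -p "$outdir" || return 1
    # Register dump, as requested in the reply above.
    ethtool -d "$iface" > "$outdir/$iface.regdump.txt" || return 1
    # Kernel ring buffer covering the failure.
    dmesg > "$outdir/$iface.dmesg.txt" || return 1
    # The down/up cycle that recovered the port in the original report.
    ip link set dev "$iface" down || return 1
    ip link set dev "$iface" up
}
```

It could be called as "dump_then_flap enp2s0f0 /var/tmp/bnx2x-diag". If either capture step fails the function returns before touching the link, so the broken state is not lost by accident.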
* Re: bnx2x - odd behaviour
From: Ian Kumlien @ 2019-04-11 8:56 UTC
To: Sudarsana Reddy Kalluru
Cc: Linux Kernel Network Developers, Ariel Elior

On Thu, Apr 4, 2019 at 4:27 PM Sudarsana Reddy Kalluru
<skalluru@marvell.com> wrote:
>
> Hi,
> We are not aware of this issue. Please collect the register dump i.e., "ethtool -d <interface>" output when this issue happens (before performing link-flap) and share it for the analysis.

I haven't been able to recreate the original issue, but I just had
something completely new happen that might be related.

FYI, these are old HP blade servers using a pass-through module (and they
can be dodgy at times)...

I brought up the second NIC to enable network redundancy and the machine
crashed (I could only see the tail of it as is), but the interesting bit is
that it wouldn't boot properly, resulting in the picture below:
https://photos.app.goo.gl/pyKEnu9qLLfvGeXC6

I don't know how useful this is, if at all, but it does seem like it is
in an incorrect state - a cold boot fixed it.

Still trying to recreate the original issue....

> Thanks,
> Sudarsana
> > -----Original Message-----
> > From: netdev-owner@vger.kernel.org <netdev-owner@vger.kernel.org> On
> > Behalf Of Ian Kumlien
> > Sent: Wednesday, April 3, 2019 8:31 PM
> > To: Linux Kernel Network Developers <netdev@vger.kernel.org>; Ariel Elior
> > <aelior@marvell.com>; Sudarsana Reddy Kalluru <skalluru@marvell.com>
> > Subject: bnx2x - odd behaviour
> >
> > [...]

* Re: bnx2x - odd behaviour
From: Ian Kumlien @ 2019-04-12 9:14 UTC
To: Sudarsana Reddy Kalluru
Cc: Linux Kernel Network Developers, Ariel Elior

Finally!

Just had a machine with the same issue!

On Thu, Apr 11, 2019 at 10:56 AM Ian Kumlien <ian.kumlien@gmail.com> wrote:
>
> On Thu, Apr 4, 2019 at 4:27 PM Sudarsana Reddy Kalluru
> <skalluru@marvell.com> wrote:
> >
> > Hi,
> > We are not aware of this issue. Please collect the register dump i.e., "ethtool -d <interface>" output when this issue happens (before performing link-flap) and share it for the analysis.

Sent the dump separately :)

* RE: bnx2x - odd behaviour
From: Sudarsana Reddy Kalluru @ 2019-04-12 10:53 UTC
To: Ian Kumlien
Cc: Linux Kernel Network Developers, Ariel Elior

Hi Ian,
Thanks for your info/help. There's not much info in the logs (e.g., FW traces, calltraces). Will contact our firmware team on the register-dump analysis and provide you the update.

Thanks,
Sudarsana

> -----Original Message-----
> From: Ian Kumlien <ian.kumlien@gmail.com>
> Sent: Friday, April 12, 2019 2:44 PM
> To: Sudarsana Reddy Kalluru <skalluru@marvell.com>
> Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>; Ariel Elior
> <aelior@marvell.com>
> Subject: Re: bnx2x - odd behaviour
>
> Finally!
>
> Just had a machine with the same issue!
>
> On Thu, Apr 11, 2019 at 10:56 AM Ian Kumlien <ian.kumlien@gmail.com>
> wrote:
> >
> > On Thu, Apr 4, 2019 at 4:27 PM Sudarsana Reddy Kalluru
> > <skalluru@marvell.com> wrote:
> > >
> > > Hi,
> > > We are not aware of this issue. Please collect the register dump i.e., "ethtool -d <interface>" output when this issue happens (before performing link-flap) and share it for the analysis.
>
> Sent the dump separately :)

* Re: bnx2x - odd behaviour
From: Ian Kumlien @ 2019-04-12 11:08 UTC
To: Sudarsana Reddy Kalluru
Cc: Linux Kernel Network Developers, Ariel Elior

On Fri, Apr 12, 2019 at 12:53 PM Sudarsana Reddy Kalluru
<skalluru@marvell.com> wrote:
>
> Hi Ian,
> Thanks for your info/help. There's not much info in the logs (e.g., FW traces, calltraces). Will contact our firmware team on the register-dump analysis and provide you the update.

Thank you =)

> Thanks,
> Sudarsana
> > -----Original Message-----
> > From: Ian Kumlien <ian.kumlien@gmail.com>
> > Sent: Friday, April 12, 2019 2:44 PM
> > To: Sudarsana Reddy Kalluru <skalluru@marvell.com>
> > Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>; Ariel Elior
> > <aelior@marvell.com>
> > Subject: Re: bnx2x - odd behaviour
> >
> > [...]

* RE: bnx2x - odd behaviour
From: Sudarsana Reddy Kalluru @ 2019-04-17 7:58 UTC
To: Ian Kumlien
Cc: Linux Kernel Network Developers, Ariel Elior, Ameen Rahman

+Ameen

Ian,
We couldn't find the root cause from the logs/register dump.
Could you please load the driver with link debugging enabled, i.e., "modprobe bnx2x debug=0x4" or "ethtool -s <interface> msglvl 0x4", and collect the complete kernel logs and the register dump (collected before performing the ifconfig down)? Please also provide the output of "ethtool -i <interface>".

Thanks,
Sudarsana

> -----Original Message-----
> From: Ian Kumlien <ian.kumlien@gmail.com>
> Sent: Friday, April 12, 2019 4:39 PM
> To: Sudarsana Reddy Kalluru <skalluru@marvell.com>
> Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>; Ariel Elior
> <aelior@marvell.com>
> Subject: Re: bnx2x - odd behaviour
>
> On Fri, Apr 12, 2019 at 12:53 PM Sudarsana Reddy Kalluru
> <skalluru@marvell.com> wrote:
> >
> > Hi Ian,
> > Thanks for your info/help. There's not much info in the logs (e.g., FW traces, calltraces). Will contact our firmware team on the register-dump analysis and provide you the update.
>
> Thank you =)
>
> > Thanks,
> > Sudarsana
> > > [...]

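The steps requested above could be scripted roughly as follows. This is only a sketch, assuming a POSIX shell run as root; the function names and file names are hypothetical, while the ethtool options and the 0x4 value are the ones quoted in the message. The runtime "ethtool -s" route is used here rather than reloading the module, on the assumption that reloading bnx2x is not an option on a production box.

```shell
#!/bin/sh
# Sketch only: function names and output file names are hypothetical.

# Turn on link debug messages for one port at runtime
# (0x4 is the msglvl value given in the message above).
enable_link_debug() {
    ethtool -s "$1" msglvl 0x4
}

# Gather the requested data; run this before taking the interface down.
collect_debug_info() {
    iface="$1"
    outdir="${2:-.}"

    mkdir -p "$outdir" || return 1
    # Driver and firmware details.
    ethtool -i "$iface" > "$outdir/$iface.drvinfo.txt" || return 1
    # Register dump taken in the failed state.
    ethtool -d "$iface" > "$outdir/$iface.regdump.txt" || return 1
    # Kernel ring buffer, now including the link debug output.
    dmesg > "$outdir/$iface.dmesg.txt"
}
```

One caveat: dmesg only shows the kernel ring buffer, which may have wrapped on a machine that logged for as long as the one in the original report, so the "complete kernel logs" may have to come from wherever the system persists kernel messages.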
* Re: bnx2x - odd behaviour
From: Ian Kumlien @ 2019-04-17 11:02 UTC
To: Sudarsana Reddy Kalluru
Cc: Linux Kernel Network Developers, Ariel Elior, Ameen Rahman

On Wed, Apr 17, 2019 at 9:58 AM Sudarsana Reddy Kalluru
<skalluru@marvell.com> wrote:
>
> +Ameen
>
> Ian,
> We couldn't find the root-cause from the logs/register-dump.
> Could you please load the driver with link-debugs enabled, i.e., modprobe bnx2x debug=0x4 or 'ethtool -s <interface> msglvl 0x4'. And collect the complete kernel logs and the register-dump(collected before performing ifconfig-down). Please also provide the output of "ethtool -i <interface>".

I'll try, this is a production system...

Could it be related to the GRO changes for UDP that were done in 5.x?

> Thanks,
> Sudarsana
> > -----Original Message-----
> > From: Ian Kumlien <ian.kumlien@gmail.com>
> > Sent: Friday, April 12, 2019 4:39 PM
> > To: Sudarsana Reddy Kalluru <skalluru@marvell.com>
> > Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>; Ariel Elior
> > <aelior@marvell.com>
> > Subject: Re: bnx2x - odd behaviour
> >
> > [...]

* RE: bnx2x - odd behaviour 2019-04-17 11:02 ` Ian Kumlien @ 2019-04-17 13:05 ` Sudarsana Reddy Kalluru 2019-04-17 13:20 ` Ian Kumlien 0 siblings, 1 reply; 13+ messages in thread From: Sudarsana Reddy Kalluru @ 2019-04-17 13:05 UTC (permalink / raw) To: Ian Kumlien; +Cc: Linux Kernel Network Developers, Ariel Elior, Ameen Rahman > -----Original Message----- > From: Ian Kumlien <ian.kumlien@gmail.com> > Sent: Wednesday, April 17, 2019 4:32 PM > To: Sudarsana Reddy Kalluru <skalluru@marvell.com> > Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>; Ariel Elior > <aelior@marvell.com>; Ameen Rahman <arahman@marvell.com> > Subject: Re: bnx2x - odd behaviour > > On Wed, Apr 17, 2019 at 9:58 AM Sudarsana Reddy Kalluru > <skalluru@marvell.com> wrote: > > > > +Ameen > > > > Ian, > > We couldn't find the root-cause from the logs/register-dump. > > Could you please load the driver with link-debugs enabled, i.e., modprobe > bnx2x debug=0x4 or 'ethtool -s <interface> msglvl 0x4'. And collect the > complete kernel logs and the register-dump(collected before performing > ifconfig-down). Please also provide the output of "ethtool -i <interface>". > > I'll try, this is a production system... > > Could it be related to the gro changes for UDP that was done in 5.x? > Thanks for your help. I'm not sure if this is related to gro, link related code is handled by different component [management firmware (mfw)]. May be the complete logs/register-dump provide some additional pointers. There were some fixes in the newer version of mfw, getting the mfw version on the chip would help (ethtool -i <interface> provides mfw/boot-code version). 
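The data requested in the message above (link debug messages enabled, then the register dump, the "ethtool -i" output and the kernel log, all captured before the interface is bounced) could be gathered with something like the sketch below. It is only an illustration and was not posted in the thread: the function names and output file names are made up here, enp2s0f0 is the interface from this report, and the commands are the ones quoted above plus dmesg for the kernel log. It would have to run as root on the affected host.

```python
import subprocess
from pathlib import Path


def enable_link_debug_command(iface):
    """argv that enables link debug messages at runtime (msglvl 0x4, as quoted above)."""
    return ["ethtool", "-s", iface, "msglvl", "0x4"]


def diagnostic_commands(iface):
    """Return (output file name, argv) pairs for the requested data.

    The file names are arbitrary choices for this sketch.
    """
    return [
        ("ethtool-i.txt", ["ethtool", "-i", iface]),  # driver and boot-code versions
        ("ethtool-d.txt", ["ethtool", "-d", iface]),  # register dump
        ("dmesg.txt", ["dmesg"]),                     # kernel log
    ]


def collect(iface, outdir="bnx2x-diag"):
    """Run each command and save whatever it printed; call this before any link flap."""
    out = Path(outdir)
    out.mkdir(parents=True, exist_ok=True)
    for name, argv in diagnostic_commands(iface):
        result = subprocess.run(argv, capture_output=True, text=True, check=False)
        (out / name).write_text(result.stdout + result.stderr)
```

On the affected host, the argv from enable_link_debug_command("enp2s0f0") would be run once up front (for example with subprocess.run), and collect("enp2s0f0") once the errors reappear, before any "ip link set down/up".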
* Re: bnx2x - odd behaviour
From: Ian Kumlien @ 2019-04-17 13:20 UTC
To: Sudarsana Reddy Kalluru
Cc: Linux Kernel Network Developers, Ariel Elior, Ameen Rahman

On Wed, Apr 17, 2019 at 3:05 PM Sudarsana Reddy Kalluru
<skalluru@marvell.com> wrote:
>
> > -----Original Message-----
> > From: Ian Kumlien <ian.kumlien@gmail.com>
> > Sent: Wednesday, April 17, 2019 4:32 PM
> > To: Sudarsana Reddy Kalluru <skalluru@marvell.com>
> > Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>; Ariel Elior
> > <aelior@marvell.com>; Ameen Rahman <arahman@marvell.com>
> > Subject: Re: bnx2x - odd behaviour
> >
> > On Wed, Apr 17, 2019 at 9:58 AM Sudarsana Reddy Kalluru
> > <skalluru@marvell.com> wrote:
> > >
> > > +Ameen
> > >
> > > Ian,
> > > We couldn't find the root-cause from the logs/register-dump.
> > > Could you please load the driver with link-debugs enabled, i.e., modprobe
> > > bnx2x debug=0x4 or 'ethtool -s <interface> msglvl 0x4'. And collect the
> > > complete kernel logs and the register-dump(collected before performing
> > > ifconfig-down). Please also provide the output of "ethtool -i <interface>".
> >
> > I'll try, this is a production system...
> >
> > Could it be related to the gro changes for UDP that was done in 5.x?
> >
> Thanks for your help. I'm not sure if this is related to gro, link related
> code is handled by different component [management firmware (mfw)]. May be
> the complete logs/register-dump provide some additional pointers. There were
> some fixes in the newer version of mfw, getting the mfw version on the chip
> would help (ethtool -i <interface> provides mfw/boot-code version).
ethtool -i enp2s0f0
driver: bnx2x
version: 1.712.30-0 storm 7.13.1.0
firmware-version: bc 6.2.28 phy baa0.105
expansion-rom-version:
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

What we can see in the logs (not with the linkdebug enabled) is:
apr 12 06:22:35 localhost kernel: bnx2x 0000:02:00.0 enp2s0f0: NIC Link is Down
apr 12 06:22:35 localhost kernel: bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
apr 12 06:22:35 localhost kernel: bnx2x 0000:02:00.0 enp2s0f0: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - transmit
apr 12 06:22:35 localhost kernel: bond0: link status up again after 400 ms for interface enp2s0f0
apr 12 06:22:36 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
apr 12 06:22:36 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:37 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1)
apr 12 06:22:37 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
apr 12 06:22:37 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:38 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (2)
apr 12 06:22:38 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
apr 12 06:22:38 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:39 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (3)
apr 12 06:22:39 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
apr 12 06:22:39 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:40 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (4)
apr 12 06:22:40 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
apr 12 06:22:40 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:41 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (5)
apr 12 06:22:41 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
apr 12 06:22:41 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:42 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (6)
apr 12 06:22:42 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
apr 12 06:22:42 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:43 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
apr 12 06:22:43 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
apr 12 06:22:44 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (7)

... and so it begins =)

> > > Thanks,
> > > Sudarsana
> > > > -----Original Message-----
> > > > From: Ian Kumlien <ian.kumlien@gmail.com>
> > > > Sent: Friday, April 12, 2019 4:39 PM
> > > > To: Sudarsana Reddy Kalluru <skalluru@marvell.com>
> > > > Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>; Ariel
> > > > Elior <aelior@marvell.com>
> > > > Subject: Re: bnx2x - odd behaviour
> > > >
> > > > On Fri, Apr 12, 2019 at 12:53 PM Sudarsana Reddy Kalluru
> > > > <skalluru@marvell.com> wrote:
> > > > >
> > > > > Hi Ian,
> > > > > Thanks for your info/help. There's not much info in the logs
> > > > > (e.g., FW traces, calltraces).
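To get a quick overview of how far such a log has progressed, the three recurring bnx2x messages shown above can be tallied with a few lines of Python. This is a rough sketch, not something from the thread: summarize and the result keys are names chosen here, and the regular expressions are derived only from the message formats visible in this thread, so other driver versions may print something different.

```python
import re
from collections import Counter

# Patterns taken from the bnx2x messages quoted in this thread.
NIG_RE = re.compile(r"NIG timer max \((\d+)\)")
GRC_RE = re.compile(r"GRC time-out (0x[0-9a-fA-F]+)")
ATTN_RE = re.compile(r"LATCHED attention (0x[0-9a-fA-F]+)")


def summarize(lines):
    """Count attention and GRC time-out values and track the highest NIG counter."""
    grc = Counter()
    attn = Counter()
    nig_max = 0
    for line in lines:
        m = NIG_RE.search(line)
        if m:
            nig_max = max(nig_max, int(m.group(1)))
            continue
        m = GRC_RE.search(line)
        if m:
            grc[m.group(1)] += 1
            continue
        m = ATTN_RE.search(line)
        if m:
            attn[m.group(1)] += 1
    return {
        "nig_timer_max": nig_max,
        "grc_timeouts": dict(grc),
        "latched_attentions": dict(attn),
    }


# Three lines copied from the log excerpt above.
sample = [
    "apr 12 06:22:36 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)",
    "apr 12 06:22:36 localhost kernel: bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384",
    "apr 12 06:22:37 localhost kernel: bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1)",
]
print(summarize(sample))
```

Feeding it the full journal instead of the sample (for example the lines of a saved dmesg file) would show whether the GRC time-out address stays the same over the whole incident, which is what the excerpts here suggest.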
* RE: bnx2x - odd behaviour
From: Sudarsana Reddy Kalluru @ 2019-04-19 5:23 UTC
To: Ian Kumlien; +Cc: Linux Kernel Network Developers, Ariel Elior, Ameen Rahman

Hi Ian,
Thanks for your info. The mfw team already analyzed the "nig timer" related logs but couldn't infer anything. From the boot-code version, the device looks to be from the older generation of Broadcom NICs. Besides the elink-logs/register-dump, could you also share the lspci output (lspci -vvv)?

Thanks,
Sudarsana
> -----Original Message-----
> From: Ian Kumlien <ian.kumlien@gmail.com>
> Sent: Wednesday, April 17, 2019 6:51 PM
> To: Sudarsana Reddy Kalluru <skalluru@marvell.com>
> Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>; Ariel Elior
> <aelior@marvell.com>; Ameen Rahman <arahman@marvell.com>
> Subject: Re: bnx2x - odd behaviour
>
> On Wed, Apr 17, 2019 at 3:05 PM Sudarsana Reddy Kalluru
> <skalluru@marvell.com> wrote:
> >
> > > -----Original Message-----
> > > From: Ian Kumlien <ian.kumlien@gmail.com>
> > > Sent: Wednesday, April 17, 2019 4:32 PM
> > > To: Sudarsana Reddy Kalluru <skalluru@marvell.com>
> > > Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>; Ariel
> > > Elior <aelior@marvell.com>; Ameen Rahman <arahman@marvell.com>
> > > Subject: Re: bnx2x - odd behaviour
> > >
> > > On Wed, Apr 17, 2019 at 9:58 AM Sudarsana Reddy Kalluru
> > > <skalluru@marvell.com> wrote:
> > > >
> > > > +Ameen
> > > >
> > > > Ian,
> > > > We couldn't find the root-cause from the logs/register-dump.
> > > > Could you please load the driver with link-debugs enabled, i.e.,
> > > > modprobe bnx2x debug=0x4 or 'ethtool -s <interface> msglvl 0x4'. And
> > > > collect the complete kernel logs and the register-dump(collected
> > > > before performing ifconfig-down). Please also provide the output of
> > > > "ethtool -i <interface>".
* Re: bnx2x - odd behaviour
From: Ian Kumlien @ 2019-04-24 14:50 UTC
To: Sudarsana Reddy Kalluru
Cc: Linux Kernel Network Developers, Ariel Elior, Ameen Rahman

On Fri, Apr 19, 2019 at 7:23 AM Sudarsana Reddy Kalluru
<skalluru@marvell.com> wrote:
>
> Hi Ian,
> Thanks for your info. Mfw team already analyzed the "nig timer" related
> logs but can't infer anything. From the boot-code version, the device look
> to be from the older generation of Broadcom nics. Besides the
> elink-logs/register-dump, could you also share the lspci output (lspci -vvv).

Yes, these are older machines =)

Sorry for the delay in answering, there has been a holiday here =)

lspci output:
02:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II BCM57711E 10-Gigabit PCIe
    Subsystem: Hewlett-Packard Company NC532i Dual Port 10GbE Multifunction BL-C Adapter
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 41
    Region 0: Memory at fb000000 (64-bit, non-prefetchable) [size=8M]
    Region 2: Memory at fa800000 (64-bit, non-prefetchable) [size=8M]
    Capabilities: [48] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst+ PME-Enable+ DSel=0 DScale=1 PME-
    Capabilities: [50] Vital Product Data
        Product Name: HP NC532i DP 10GbE Multifunction BL-c Adapter
        Read-only fields:
            [PN] Part number: N/A
            [EC] Engineering changes: N/A
            [SN] Serial number: 0123456789
            [MN] Manufacture ID: 31 34 65 34
            [RV] Reserved: checksum good, 39 byte(s) reserved
        End
    Capabilities: [58] MSI: Enable- Count=1/8 Maskable- 64bit+
        Address: 0000000000000000 Data: 0000
    Capabilities: [a0] MSI-X: Enable+ Count=17 Masked-
        Vector table: BAR=0 offset=00440000
        PBA: BAR=0 offset=00441800
    Capabilities: [ac] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 <2us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
        DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr+ NoSnoop+
            MaxPayload 256 bytes, MaxReadReq 4096 bytes
        DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
        LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s <2us, L1 <2us
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [100 v1] Device Serial Number 44-1e-a1-ff-fe-45-a6-38
    Capabilities: [110 v1] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
        UESvrt: DLP- SDES+ TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    Capabilities: [150 v1] Power Budgeting <?>
    Capabilities: [160 v1] Virtual Channel
        Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
        Arb: Fixed- WRR32- WRR64- WRR128-
        Ctrl: ArbSelect=Fixed
        Status: InProgress-
        VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
            Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
            Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
            Status: NegoPending- InProgress-
    Kernel driver in use: bnx2x
    Kernel modules: bnx2x
---

> Thanks,
> Sudarsana
> > -----Original Message-----
> > From: Ian Kumlien <ian.kumlien@gmail.com>
> > Sent: Wednesday, April 17, 2019 6:51 PM
> > To: Sudarsana Reddy Kalluru <skalluru@marvell.com>
> > Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>; Ariel Elior
> > <aelior@marvell.com>; Ameen Rahman <arahman@marvell.com>
> > Subject: Re: bnx2x - odd behaviour
> >
> > On Wed, Apr 17, 2019 at 3:05 PM Sudarsana Reddy Kalluru
> > <skalluru@marvell.com> wrote:
> > >
> > > > -----Original Message-----
> > > > From: Ian Kumlien <ian.kumlien@gmail.com>
> > > > Sent: Wednesday, April 17, 2019 4:32 PM
> > > > To: Sudarsana Reddy Kalluru <skalluru@marvell.com>
> > > > Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>; Ariel
> > > > Elior <aelior@marvell.com>; Ameen Rahman <arahman@marvell.com>
> > > > Subject: Re: bnx2x - odd behaviour
> > > >
> > > > On Wed, Apr 17, 2019 at 9:58 AM Sudarsana Reddy Kalluru
> > > > <skalluru@marvell.com> wrote:
> > > > >
> > > > > +Ameen
> > > > >
> > > > > Ian,
> > > > > We couldn't find the root-cause from the logs/register-dump.
> > > > > Could you please load the driver with link-debugs enabled, i.e.,
> > > > > modprobe bnx2x debug=0x4 or 'ethtool -s <interface> msglvl 0x4'. And
> > > > > collect the complete kernel logs and the register-dump(collected
> > > > > before performing ifconfig-down). Please also provide the output of
> > > > > "ethtool -i <interface>".
> > > >
> > > > I'll try, this is a production system...
> > > >
> > > > Could it be related to the gro changes for UDP that was done in 5.x?
> > > >
> > > Thanks for your help.
* RE: bnx2x - odd behaviour
From: Sudarsana Reddy Kalluru @ 2019-04-25 6:20 UTC
To: Ian Kumlien; +Cc: Linux Kernel Network Developers, Ariel Elior, Ameen Rahman

Hi Ian,
Thanks for the info. BCM57711E is an older version of the chip. Could you please recreate the issue with elink-debugs enabled (modprobe bnx2x debug=0x4) and provide the complete logs and the register-dump?

Thanks,
Sudarsana
> -----Original Message-----
> From: Ian Kumlien <ian.kumlien@gmail.com>
> Sent: Wednesday, April 24, 2019 8:20 PM
> To: Sudarsana Reddy Kalluru <skalluru@marvell.com>
> Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>; Ariel Elior
> <aelior@marvell.com>; Ameen Rahman <arahman@marvell.com>
> Subject: Re: bnx2x - odd behaviour
>
> On Fri, Apr 19, 2019 at 7:23 AM Sudarsana Reddy Kalluru
> <skalluru@marvell.com> wrote:
> >
> > Hi Ian,
> > Thanks for your info. Mfw team already analyzed the "nig timer" related
> > logs but can't infer anything. From the boot-code version, the device look
> > to be from the older generation of Broadcom nics. Besides the
> > elink-logs/register-dump, could you also share the lspci output (lspci -vvv).
>
> Yes, this is older machines =)
>
> Sorry for the delay in answering, there has been a holiday here, =)
>
> lspci output:
> 02:00.0 Ethernet controller: Broadcom Inc.
> > > > > > ethtool -i enp2s0f0 > > > driver: bnx2x > > > version: 1.712.30-0 storm 7.13.1.0 > > > firmware-version: bc 6.2.28 phy baa0.105 > > > expansion-rom-version: > > > bus-info: 0000:02:00.0 > > > supports-statistics: yes > > > supports-test: yes > > > supports-eeprom-access: yes > > > supports-register-dump: yes > > > supports-priv-flags: yes > > > > > > What we can see in the logs (not with the linkdebug enabled) is: > > > apr 12 06:22:35 localhost kernel: bnx2x 0000:02:00.0 enp2s0f0: NIC > > > Link is Down apr 12 06:22:35 localhost kernel: bond0: link status > > > down for active interface enp2s0f0, disabling it in 1000 ms apr 12 > 06:22:35 localhost kernel: > > > bnx2x 0000:02:00.0 enp2s0f0: NIC Link is Up, 10000 Mbps full duplex, > > > Flow > > > control: ON - transmit apr 12 06:22:35 localhost kernel: bond0: link > > > status up again after > > > 400 ms for interface enp2s0f0 > > > apr 12 06:22:36 localhost kernel: bnx2x: > > > [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention > > > 0x04000000 (masked) > > > apr 12 06:22:36 localhost kernel: bnx2x: > > > [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384 > > > apr > > > 12 06:22:37 localhost kernel: bnx2x: > > > [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1) apr 12 > > > 06:22:37 localhost kernel: bnx2x: > > > [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention > > > 0x04000000 (masked) > > > apr 12 06:22:37 localhost kernel: bnx2x: > > > [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384 > > > apr > > > 12 06:22:38 localhost kernel: bnx2x: > > > [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (2) apr 12 > > > 06:22:38 localhost kernel: bnx2x: > > > [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention > > > 0x04000000 (masked) > > > apr 12 06:22:38 localhost kernel: bnx2x: > > > [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384 > > > apr > > > 12 06:22:39 localhost kernel: bnx2x: > > > 
[bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (3) apr 12 > > > 06:22:39 localhost kernel: bnx2x: > > > [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention > > > 0x04000000 (masked) > > > apr 12 06:22:39 localhost kernel: bnx2x: > > > [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384 > > > apr > > > 12 06:22:40 localhost kernel: bnx2x: > > > [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (4) apr 12 > > > 06:22:40 localhost kernel: bnx2x: > > > [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention > > > 0x04000000 (masked) > > > apr 12 06:22:40 localhost kernel: bnx2x: > > > [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384 > > > apr > > > 12 06:22:41 localhost kernel: bnx2x: > > > [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (5) apr 12 > > > 06:22:41 localhost kernel: bnx2x: > > > [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention > > > 0x04000000 (masked) > > > apr 12 06:22:41 localhost kernel: bnx2x: > > > [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384 > > > apr > > > 12 06:22:42 localhost kernel: bnx2x: > > > [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (6) apr 12 > > > 06:22:42 localhost kernel: bnx2x: > > > [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention > > > 0x04000000 (masked) > > > apr 12 06:22:42 localhost kernel: bnx2x: > > > [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384 > > > apr > > > 12 06:22:43 localhost kernel: bnx2x: > > > [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention > > > 0x04000000 (masked) > > > apr 12 06:22:43 localhost kernel: bnx2x: > > > [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384 > > > apr > > > 12 06:22:44 localhost kernel: bnx2x: > > > [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (7) ... 
and so it > > > begins =) > > > > > > > > > Thanks, > > > > > > Sudarsana > > > > > > > -----Original Message----- > > > > > > > From: Ian Kumlien <ian.kumlien@gmail.com> > > > > > > > Sent: Friday, April 12, 2019 4:39 PM > > > > > > > To: Sudarsana Reddy Kalluru <skalluru@marvell.com> > > > > > > > Cc: Linux Kernel Network Developers > > > > > > > <netdev@vger.kernel.org>; Ariel Elior <aelior@marvell.com> > > > > > > > Subject: Re: bnx2x - odd behaviour > > > > > > > > > > > > > > On Fri, Apr 12, 2019 at 12:53 PM Sudarsana Reddy Kalluru > > > > > > > <skalluru@marvell.com> wrote: > > > > > > > > > > > > > > > > Hi Ian, > > > > > > > > Thanks for your info/help. There's not much info in the > > > > > > > > logs (e.g., FW > > > > > > > traces, calltraces). Will contact our firmware team on the > > > > > > > register-dump analysis and provide you the update. > > > > > > > > > > > > > > Thank you =) > > > > > > > > > > > > > > > Thanks, > > > > > > > > Sudarsana > > > > > > > > > -----Original Message----- > > > > > > > > > From: Ian Kumlien <ian.kumlien@gmail.com> > > > > > > > > > Sent: Friday, April 12, 2019 2:44 PM > > > > > > > > > To: Sudarsana Reddy Kalluru <skalluru@marvell.com> > > > > > > > > > Cc: Linux Kernel Network Developers > > > > > > > > > <netdev@vger.kernel.org>; Ariel Elior > > > > > > > > > <aelior@marvell.com> > > > > > > > > > Subject: Re: bnx2x - odd behaviour > > > > > > > > > > > > > > > > > > Finally! > > > > > > > > > > > > > > > > > > Just had a machine with the same issue! > > > > > > > > > > > > > > > > > > On Thu, Apr 11, 2019 at 10:56 AM Ian Kumlien > > > > > > > > > <ian.kumlien@gmail.com> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > On Thu, Apr 4, 2019 at 4:27 PM Sudarsana Reddy Kalluru > > > > > > > > > > <skalluru@marvell.com> wrote: > > > > > > > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > We are not aware of this issue. 
Please collect > > > > > > > > > > > the register dump i.e., > > > > > > > > > "ethtool -d <interface>" output when this issue happens > > > > > > > > > (before performing > > > > > > > > > link-flap) and share it for the analysis. > > > > > > > > > > > > > > > > > > Sent the dump separately :) ^ permalink raw reply [flat|nested] 13+ messages in thread
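
For reference, the data requested in this thread (link debug via msglvl 0x4, "ethtool -i", "ethtool -d", "lspci -vvv" and the kernel log) could be gathered with a small shell helper along these lines. This is only a sketch under assumptions: the function names, the DRY_RUN switch and the output file names are made up for illustration and are not part of bnx2x or ethtool, and a real run needs root plus ethtool and lspci installed.

```shell
# Sketch only: helper names, DRY_RUN and file names are hypothetical.

# Run a command, or just print it when DRY_RUN=1.
_run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Save a command's output to a file, or print what would be saved when DRY_RUN=1.
_capture() {
    out="$1"; shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$* > $out"
    else
        "$@" > "$out"
    fi
}

# Step 1 (before the problem shows up): enable link debug messages.
enable_link_debug() {
    _run ethtool -s "$1" msglvl 0x4
}

# Step 2 (while the NIC is in the broken state, before any down/up):
# collect driver/firmware info, register dump, PCI details and kernel log.
collect_bnx2x_debug() {
    iface="$1"
    outdir="${2:-bnx2x-debug}"
    _run mkdir -p "$outdir" || return 1
    _capture "$outdir/ethtool-i.txt" ethtool -i "$iface"
    _capture "$outdir/ethtool-d.txt" ethtool -d "$iface"
    _capture "$outdir/lspci.txt" lspci -vvv
    _capture "$outdir/dmesg.txt" dmesg
}
```

Running "DRY_RUN=1 collect_bnx2x_debug enp2s0f0" first shows the commands without touching the interface, which may be preferable on a production system.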
Thread overview: 13+ messages
2019-04-03 15:00 bnx2x - odd behaviour Ian Kumlien
2019-04-04 14:27 ` Sudarsana Reddy Kalluru
2019-04-11  8:56 ` Ian Kumlien
2019-04-12  9:14 ` Ian Kumlien
2019-04-12 10:53 ` Sudarsana Reddy Kalluru
2019-04-12 11:08 ` Ian Kumlien
2019-04-17  7:58 ` Sudarsana Reddy Kalluru
2019-04-17 11:02 ` Ian Kumlien
2019-04-17 13:05 ` Sudarsana Reddy Kalluru
2019-04-17 13:20 ` Ian Kumlien
2019-04-19  5:23 ` Sudarsana Reddy Kalluru
2019-04-24 14:50 ` Ian Kumlien
2019-04-25  6:20 ` Sudarsana Reddy Kalluru