From: Ian Kumlien
Date: Thu, 11 Apr 2019 10:56:28 +0200
Subject: Re: bnx2x - odd behaviour
To: Sudarsana Reddy Kalluru
Cc: Linux Kernel Network Developers, Ariel Elior
X-Mailing-List: netdev@vger.kernel.org

On Thu, Apr 4, 2019 at 4:27 PM Sudarsana Reddy Kalluru wrote:
>
> Hi,
> We are not aware of this issue. Please collect the register dump i.e.,
> "ethtool -d " output when this issue happens (before performing link-flap)
> and share it for the analysis.

I haven't been able to recreate the original issue, but I just had
something completely new happen that might be related.

FYI, these are old HP blade servers using a pass-through module (and they
can be dodgy at times)...

I brought up the second NIC to enable network redundancy and the machine
crashed (I could only see the tail of it as is), but the interesting bit is
that it wouldn't boot properly, resulting in the picture below:
https://photos.app.goo.gl/pyKEnu9qLLfvGeXC6

I don't know how useful this is, if at all, but it does seem like it is
in an incorrect state - a cold boot fixed it.

Still trying to recreate the original issue....

> Thanks,
> Sudarsana
> > -----Original Message-----
> > From: netdev-owner@vger.kernel.org On Behalf Of Ian Kumlien
> > Sent: Wednesday, April 3, 2019 8:31 PM
> > To: Linux Kernel Network Developers; Ariel Elior; Sudarsana Reddy Kalluru
> > Subject: bnx2x - odd behaviour
> >
> > Hi,
> >
> > We just had this happen on 5.0.2
> >
> > It looks like the interface went down, ended up in a broken state and an
> > "ip li set down/up dev enp2s0f0" made it work again
> >
> > It looks really weird and I haven't really seen anything like it, anyone
> > with a clue?
> >
> > dmesg:
> > ....
> > [1310361.808694] bnx2x 0000:02:00.0 enp2s0f0: NIC Link is Down
> > [1310361.824554] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1310362.872678] bond0: link status definitely down for interface enp2s0f0, disabling it
> > [1310362.880691] device enp2s0f0 left promiscuous mode
> > [1310363.188592] bnx2x 0000:02:00.0 enp2s0f0: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - transmit
> > [1310363.200653] bond0: link status up for interface enp2s0f0, enabling it in 0 ms
> > [1310363.208192] bond0: link status definitely up for interface enp2s0f0, 10000 Mbps full duplex
> > [1310363.216885] bond0: making interface enp2s0f0 the new active one
> > [1310363.223075] device enp2s0f0 entered promiscuous mode
> > [1310363.228613] bond0: first active interface up!
> > [1310364.048805] bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
> > [1310364.058297] bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
> > [1310365.072604] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1)
> > [1310366.096679] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (2)
> > [1310366.103922] bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
> > [1310366.113387] bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
> > [1310367.120518] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (3)
> > [1310368.144635] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (4)
> > [1310369.168591] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (5)
> > [1310371.216519] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (6)
> > ... it does go on ...
> > [1312156.028230] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1520)
> > [1312157.052226] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1521)
> > [1312157.059842] bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
> > [1312157.069242] bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
> > [1312158.076261] bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
> > [1312158.085657] bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
> > [1312159.100154] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1522)
> > [1312160.124226] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1523)
> > [1312161.148127] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1524)
> > [1312162.172102] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1525)
> > [1312163.196000] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1526)
> > [1312163.203610] bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
> > [1312163.213082] bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
> > [1312164.220248] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1527)
> > [1312165.244119] bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
> > [1312165.253524] bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
> > [1312166.268053] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1528)
> > [1312167.292105] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1529)
> > [1312168.316022] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1530)
> > [1312169.340014] bnx2x: [bnx2x_hw_stats_update:869(enp2s0f0)]NIG timer max (1531)
> > [1312169.347584] bnx2x: [bnx2x_attn_int_deasserted3:4357(enp2s0f0)]LATCHED attention 0x04000000 (masked)
> > [1312169.357054] bnx2x: [bnx2x_attn_int_deasserted3:4361(enp2s0f0)]GRC time-out 0x08004384
> >
> > ... trying to bring it down ...
> >
> > [1312169.659992] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312169.672041] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312169.682084] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312169.692159] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312169.702026] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312169.712081] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312169.722097] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312169.732073] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312169.742079] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312169.752066] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312169.762017] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312169.771958] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312169.782085] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > .... on and on ...
> > [1312170.434045] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312170.444012] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312170.454024] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312170.463879] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312170.473950] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312170.484107] bond0: link status down for active interface enp2s0f0, disabling it in 1000 ms
> > [1312171.532119] bond0: link status definitely down for interface enp2s0f0, disabling it
> >
> > ... bringing it up again ...
> >
> > [1312171.540128] device enp2s0f0 left promiscuous mode
> > [1312189.213375] bnx2x 0000:02:00.0 enp2s0f0: using MSI-X IRQs: sp 42 fp[0] 44 ... fp[7] 51
> > [1312190.780919] bnx2x 0000:02:00.0 enp2s0f0: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - transmit
> > [1312190.787840] bond0: link status up for interface enp2s0f0, enabling it in 0 ms
> > [1312190.798618] bond0: link status definitely up for interface enp2s0f0, 10000 Mbps full duplex
> > [1312190.807307] bond0: making interface enp2s0f0 the new active one
> > [1312190.813560] device enp2s0f0 entered promiscuous mode
> > [1312190.820884] bond0: first active interface up!
> > ---