From: Petr Oros <poros@redhat.com>
To: Jacob Keller <jacob.e.keller@intel.com>, netdev@vger.kernel.org
Cc: jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com,
	davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
	intel-wired-lan@lists.osuosl.org, linux-kernel@vger.kernel.org,
	ivecera@redhat.com
Subject: Re: [PATCH] ice: wait for EMP reset after firmware flash
Date: Wed, 13 Apr 2022 17:38:56 +0200
Message-ID: <8106efcab543ada95ac7ea9e56c47889f7b44f3d.camel@redhat.com>
In-Reply-To: <092c941b-a057-5cf0-97d8-0c061768dae7@intel.com>

Jacob Keller wrote on Tue, 12 Apr 2022 at 09:58 -0700:
> 
> 
> On 4/12/2022 3:27 AM, Petr Oros wrote:
> > We need to wait for the EMP reset to complete after a firmware flash.
> > This code was extracted from the OOT driver. Without this wait,
> > fw_activate leaves the card in an inconsistent state that is
> > recoverable only by a second flash/activate.
> > 
> > Reproducer:
> > [root@host ~]# devlink dev flash pci/0000:ca:00.0 file E810_XXVDA4_FH_O_SEC_FW_1p6p1p9_NVM_3p10_PLDMoMCTP_0.11_8000AD7B.bin
> > Preparing to flash
> > [fw.mgmt] Erasing
> > [fw.mgmt] Erasing done
> > [fw.mgmt] Flashing 100%
> > [fw.mgmt] Flashing done 100%
> > [fw.undi] Erasing
> > [fw.undi] Erasing done
> > [fw.undi] Flashing 100%
> > [fw.undi] Flashing done 100%
> > [fw.netlist] Erasing
> > [fw.netlist] Erasing done
> > [fw.netlist] Flashing 100%
> > [fw.netlist] Flashing done 100%
> > Activate new firmware by devlink reload
> > [root@host ~]# devlink dev reload pci/0000:ca:00.0 action fw_activate
> > reload_actions_performed:
> >     fw_activate
> > [root@host ~]# ip link show ens7f0
> > 71: ens7f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
> >     link/ether b4:96:91:dc:72:e0 brd ff:ff:ff:ff:ff:ff
> >     altname enp202s0f0
> > 
> > dmesg after flash:
> > [   55.120788] ice: Copyright (c) 2018, Intel Corporation.
> > [   55.274734] ice 0000:ca:00.0: Get PHY capabilities failed status = -5, continuing anyway
> > [   55.569797] ice 0000:ca:00.0: The DDP package was successfully loaded: ICE OS Default Package version 1.3.28.0
> > [   55.603629] ice 0000:ca:00.0: Get PHY capability failed.
> > [   55.608951] ice 0000:ca:00.0: ice_init_nvm_phy_type failed: -5
> > [   55.647348] ice 0000:ca:00.0: PTP init successful
> > [   55.675536] ice 0000:ca:00.0: DCB is enabled in the hardware, max number of TCs supported on this port are 8
> > [   55.685365] ice 0000:ca:00.0: FW LLDP is disabled, DCBx/LLDP in SW mode.
> > [   55.692179] ice 0000:ca:00.0: Commit DCB Configuration to the hardware
> > [   55.701382] ice 0000:ca:00.0: 126.024 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x8 link at 0000:c9:02.0 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link)
> > A reboot doesn't help; only a second flash/activate with the OOT or
> > patched driver puts the card back in a consistent state.
> > 
> > After patch:
> > [root@host ~]# devlink dev flash pci/0000:ca:00.0 file E810_XXVDA4_FH_O_SEC_FW_1p6p1p9_NVM_3p10_PLDMoMCTP_0.11_8000AD7B.bin
> > Preparing to flash
> > [fw.mgmt] Erasing
> > [fw.mgmt] Erasing done
> > [fw.mgmt] Flashing 100%
> > [fw.mgmt] Flashing done 100%
> > [fw.undi] Erasing
> > [fw.undi] Erasing done
> > [fw.undi] Flashing 100%
> > [fw.undi] Flashing done 100%
> > [fw.netlist] Erasing
> > [fw.netlist] Erasing done
> > [fw.netlist] Flashing 100%
> > [fw.netlist] Flashing done 100%
> > Activate new firmware by devlink reload
> > [root@host ~]# devlink dev reload pci/0000:ca:00.0 action fw_activate
> > reload_actions_performed:
> >     fw_activate
> > [root@host ~]# ip link show ens7f0
> > 19: ens7f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
> >     link/ether b4:96:91:dc:72:e0 brd ff:ff:ff:ff:ff:ff
> >     altname enp202s0f0
> > 
> 
> Ahh.. good find. I checked a bunch of places, but didn't check here
> for differences. :(
> 
> For what it's worth, I checked the source history of the out-of-tree
> driver this came from. It appears to be a workaround added for fixing
> a similar issue.
> 
> I haven't been able to dig up the full details yet. It appears to be
> a collision with firmware finalizing recovery after the EMP reset.
> 
> Still trying to dig for any more information I can find.

An interesting time frame to look at could be around this commit:
08771bce330036 ("ice: Continue probe on link/PHY errors")
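
For anyone who wants to script the check from the reproducer, here is a
minimal sketch in POSIX shell. The helper names link_has_carrier and
phy_init_failed are hypothetical (they are not part of the patch, the
driver, or iproute2); they only grep the text of "ip link show" and
"dmesg" for the strings quoted above, and the exact log wording is an
assumption that may differ between driver versions.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of the patch.

# Succeeds if the "ip link show" text on stdin lists LOWER_UP
# inside the <...> flags field, as in the "After patch" output.
link_has_carrier() {
    grep -q '<[^>]*LOWER_UP[^>]*>'
}

# Succeeds if the dmesg text on stdin contains one of the PHY
# failure messages quoted in the reproducer (assumed wording).
phy_init_failed() {
    grep -q -e 'Get PHY capabilit.* failed' \
            -e 'ice_init_nvm_phy_type failed'
}
```

After the fw_activate reload, usage would look something like
"ip link show ens7f0 | link_has_carrier && echo carrier" and
"dmesg | phy_init_failed && echo 'PHY init failed'".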

Petr

> 
> > Fixes: 399e27dbbd9e94 ("ice: support immediate firmware activation via devlink reload")
> > Signed-off-by: Petr Oros <poros@redhat.com>
> > ---
> >  drivers/net/ethernet/intel/ice/ice_main.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> > index d768925785ca79..90ea2203cdc763 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_main.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> > @@ -6931,12 +6931,15 @@ static void ice_rebuild(struct ice_pf *pf, enum ice_reset_req reset_type)
> >  
> >         dev_dbg(dev, "rebuilding PF after reset_type=%d\n", reset_type);
> >  
> > +#define ICE_EMP_RESET_SLEEP 5000
> >         if (reset_type == ICE_RESET_EMPR) {
> >                 /* If an EMP reset has occurred, any previously pending flash
> >                  * update will have completed. We no longer know whether or
> >                  * not the NVM update EMP reset is restricted.
> >                  */
> >                 pf->fw_emp_reset_disabled = false;
> > +
> > +               msleep(ICE_EMP_RESET_SLEEP);
> >         }
> >  
> >         err = ice_init_all_ctrlq(hw);
> 

