From: Ido Schimmel <idosch@idosch.org>
To: Andrew Lunn <andrew@lunn.ch>
Cc: netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org,
	mkubecek@suse.cz, pali@kernel.org, vadimp@nvidia.com,
	mlxsw@nvidia.com, Ido Schimmel <idosch@nvidia.com>
Subject: Re: [RFC PATCH net-next 2/8] ethtool: Add ability to reset transceiver modules
Date: Tue, 10 Aug 2021 16:05:07 +0300
Message-ID: <YRJ5g/W11V0mjKHs@shredder>
In-Reply-To: <YRF+a6C/wHa7+2Gs@lunn.ch>

On Mon, Aug 09, 2021 at 09:13:47PM +0200, Andrew Lunn wrote:
> On Mon, Aug 09, 2021 at 01:21:46PM +0300, Ido Schimmel wrote:
> > From: Ido Schimmel <idosch@nvidia.com>
> > 
> > Add a new ethtool message, 'ETHTOOL_MSG_MODULE_RESET_ACT', which allows
> > user space to request a reset of transceiver modules. A successful reset
> > results in a notification being emitted to user space in the form of a
> > 'ETHTOOL_MSG_MODULE_RESET_NTF' message.
> > 
> > Reset can be performed by either asserting the relevant hardware signal
> > ("Reset" in CMIS / "ResetL" in SFF-8636) or by writing to the relevant
> > reset bit in the module's EEPROM (page 00h, byte 26, bit 3 in CMIS /
> > page 00h, byte 93, bit 7 in SFF-8636).
> > 
> > Reset is useful in order to allow a module to transition out of a fault
> > state. From section 6.3.2.12 in CMIS 5.0: "Except for a power cycle, the
> > only exit path from the ModuleFault state is to perform a module reset
> > by taking an action that causes the ResetS transition signal to become
> > TRUE (see Table 6-11)".
> > 
> > To avoid changes to the operational state of the device, reset can only
> > be performed when the device is administratively down.
> > 
> > Example usage:
> > 
> >  # ethtool --reset-module swp11
> >  netlink error: Cannot reset module when port is administratively up
> >  netlink error: Invalid argument
> > 
> >  # ip link set dev swp11 down
> > 
> >  # ethtool --reset-module swp11
> > 
> > Monitor notifications:
> > 
> >  $ ethtool --monitor
> >  listening...
> > 
> >  Module reset done for swp11
> 
> Again, i'm wondering, why is user space doing the reset? Can you think
> of any other piece of hardware where Linux relies on user space
> performing a reset before the kernel can properly use it?
> 
> How long does a reset take? Table 10-1 says the reset pulse must be
> 10uS and table 10-2 says the reset should not take longer than
> 2000ms.

It takes about 1.5ms to get an ACK on the reset request and another few
seconds to ensure the module is in a valid operational state (I will
remove RTNL in the next version).
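
To illustrate the two phases (a quick ACK, then a longer wait for the
module to settle), here is a minimal user-space sketch of a bounded
polling loop. It is not the code from this series. The callback types,
the fake_module helper, the timeout and the poll interval are all
hypothetical and only serve to make the example self-contained:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callbacks: a real driver would query firmware/the module
 * and use a proper sleep primitive instead. */
typedef bool (*module_ready_fn)(void *ctx);
typedef void (*sleep_ms_fn)(void *ctx, uint32_t ms);

/* Poll until the module reports a valid operational state or until
 * timeout_ms has been spent sleeping.
 * Returns 0 on success, -1 on timeout. */
static int module_reset_wait(module_ready_fn ready, sleep_ms_fn sleep_ms,
			     void *ctx, uint32_t timeout_ms,
			     uint32_t interval_ms)
{
	uint32_t waited = 0;

	assert(ready && sleep_ms && interval_ms > 0);

	for (;;) {
		if (ready(ctx))
			return 0;
		if (waited >= timeout_ms)
			return -1;
		sleep_ms(ctx, interval_ms);
		waited += interval_ms;
	}
}

/* Simulated module (assumption, for trying the loop without hardware):
 * it becomes ready once the fake clock reaches ready_at_ms. */
struct fake_module {
	uint32_t now_ms;
	uint32_t ready_at_ms;
};

static bool fake_ready(void *ctx)
{
	const struct fake_module *m = ctx;

	return m->now_ms >= m->ready_at_ms;
}

static void fake_sleep(void *ctx, uint32_t ms)
{
	struct fake_module *m = ctx;

	m->now_ms += ms;	/* advance the fake clock instead of sleeping */
}
```

Because the wait is bounded and can sleep for seconds, it is the kind
of loop one would rather not run with a global lock held.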

> So maybe reset it on ifup if it is in a bad state?

We can have multiple ports (split) using the same module and in CMIS
each data path is controlled by a different state machine. Given the
complexity of these modules and possible faults, it is possible to
imagine a situation in which a few ports are fine and the rest are
unable to obtain a carrier.

Resetting the module on ifup of swp1s0 is not intuitive, and bringing up
one split port shouldn't affect the others (e.g., swp1s1). With the dedicated reset
command we have the ability to enforce all the required restrictions
from the start instead of changing the behavior of existing commands.
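
As a rough sketch of one such restriction, the check below refuses a
reset unless every port sharing the module is administratively down.
This is only an illustration under assumptions: struct port, its fields
and module_reset_check() are hypothetical names, not the structures
used by the patches, and -EINVAL is chosen only to match the "Invalid
argument" error in the example output:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a (possibly split) port. */
struct port {
	const char *name;	/* e.g. "swp1s0" */
	unsigned int module;	/* index of the transceiver module in use */
	bool admin_up;		/* administrative state */
};

/* Return 0 if the module may be reset, -EINVAL if any port using it is
 * administratively up, -ENODEV if no port uses the module at all. */
static int module_reset_check(const struct port *ports, size_t n,
			      unsigned int module)
{
	bool found = false;
	size_t i;

	assert(ports || n == 0);

	for (i = 0; i < n; i++) {
		if (ports[i].module != module)
			continue;
		found = true;
		if (ports[i].admin_up)
			return -EINVAL;
	}

	return found ? 0 : -ENODEV;
}
```

With swp1s0 down and swp1s1 up on the same module, the check fails for
that module until swp1s1 is brought down as well, so a reset never
disturbs a split port that is still in use.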

> I assume the driver/firmware is monitoring the SFP and if it goes into
> a state which requires a reset it indicates carrier down? Wasn't there
> some patches which added link down reasons? It would make sense to add
> enum ethtool_link_ext_substate_sfp_fault? You can then use ethtool to
> see what state the module is in, and a down/ip should reset it?

I will look into extending the interface with more reasons and parsing
the CMIS ModuleFaultCause (see table 8-15) in ethtool(8).
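
A rough sketch of what such parsing could look like is below. The
function name is hypothetical, and the code-to-string mapping is my
recollection of the CMIS encoding, so it must be checked against table
8-15 before being relied upon:

```c
#include <stdint.h>

/* Translate a CMIS ModuleFaultCause value into a human-readable string.
 * Assumption: values follow the CMIS encoding as recalled here
 * (0-3 defined, 32-63 custom, everything else reserved); verify against
 * table 8-15 of the specification. */
static const char *cmis_module_fault_cause_str(uint8_t cause)
{
	switch (cause) {
	case 0:
		return "No fault detected";
	case 1:
		return "TEC runaway";
	case 2:
		return "Data memory corrupted";
	case 3:
		return "Program memory corrupted";
	}

	if (cause >= 32 && cause <= 63)
		return "Custom";

	return "Reserved";
}
```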
