From: "Keller, Jacob E" <jacob.e.keller@intel.com>
To: Ido Schimmel <idosch@idosch.org>, Jakub Kicinski <kuba@kernel.org>
Cc: Andrew Lunn <andrew@lunn.ch>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"mkubecek@suse.cz" <mkubecek@suse.cz>,
	"pali@kernel.org" <pali@kernel.org>,
	"vadimp@nvidia.com" <vadimp@nvidia.com>,
	"mlxsw@nvidia.com" <mlxsw@nvidia.com>,
	Ido Schimmel <idosch@nvidia.com>
Subject: Re: [RFC PATCH net-next 1/8] ethtool: Add ability to control transceiver modules' low power mode
Date: Tue, 10 Aug 2021 22:00:51 +0000
Message-ID: <71a5bd72-2154-a796-37b7-f39afdf2e34d@intel.com>
In-Reply-To: <YRLlpCutXmthqtOg@shredder>

On 8/10/2021 1:46 PM, Ido Schimmel wrote:
> On Tue, Aug 10, 2021 at 06:59:54AM -0700, Jakub Kicinski wrote:
>> On Tue, 10 Aug 2021 15:52:20 +0200 Andrew Lunn wrote:
>>>> The transition from low power to high power can take a few seconds with
>>>> QSFP/QSFP-DD and it's likely to only get longer with future / more
>>>> complex modules. Therefore, to reduce link-up time, the firmware
>>>> automatically transitions modules to high power mode.
>>>>
>>>> There is obviously a trade-off here between power consumption and
>>>> link-up time. My understanding is that Mellanox is not the only vendor
>>>> favoring shorter link-up times as users have the ability to control the
>>>> low power mode of the modules in other implementations.
>>>>
>>>> Regarding "why do we need user space involved?", by default, it does not
>>>> need to be involved (the system works without this API), but if it wants
>>>> to reduce the power consumption by setting unused modules to low power
>>>> mode, then it will need to use this API.  
>>>
>>> O.K. Thanks for the better explanation. Some of this should go into
>>> the commit message.
>>>
>>> I suggest it gets a different name and semantics, to avoid
>>> confusion. I think we should consider this the default power mode for
>>> when the link is administratively down, rather than direct control
>>> over the module's power mode. The driver should transition the module
>>> to this setting on link down, be it high power or low power. That
>>> saves a lot of complexity, since I assume you currently need a udev
>>> script or something which sets it to low power mode on link down,
>>> whereas you can avoid this by configuring the default and letting the
>>> driver do it.
>>
>> Good point. And actually NICs have similar knobs, exposed via ethtool
>> priv flags today. Intel NICs for example. Maybe we should create a
>> "really power the port down policy" API?
> 
> See below about Intel. I'm not sure it's the same thing...
> 
> I'm against adding a vague "really power the port down policy" API. The
> API proposed in the patch is well-defined, its implementation is
> documented in standards, its implications are clear and we offer APIs
> that give user space full observability into its operation.
> 
> A vague API means that it is going to be abused and user space will get
> different results over different implementations. After reading the
> *commit messages* about the private flags, I'm not sure what the flags
> really do, what is their true motivation, implications or how do I get
> observability into their operation. I'm not too hopeful about the user
> documentation.
> 
> Also, like I mentioned in the cover letter, given the complexity of
> these modules and as they become more common, it is likely that we will
> need to extend the API to control more parameters and expose more
> diagnostic information. I would really like to keep it clean and
> contained in 'ETHTOOL_MSG_MODULE_*' messages and not spread it over
> different APIs.
> 
>>
>> Jake do you know what the use cases for Intel are? Are they SFP, MAC,
>> or NC-SI related?
> 
> I went through all the Intel drivers that implement these operations and
> I believe you are talking about these commits:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c3880bd159d431d06b687b0b5ab22e24e6ef0070
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d5ec9e2ce41ac198de2ee18e0e529b7ebbc67408
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ab4ab73fc1ec6dec548fa36c5e383ef5faa7b4c1
> 
> There isn't too much information about the motivation, but maybe it has
> something to do with multi-host controllers where you want to prevent
> one host from taking the physical link down for all the other hosts
> sharing it? I remember such issues with mlx5.
> 

Ok, I found some more information here. The primary motivation of the
changes in the i40e and ice drivers is from customer requests asking to
have the link go down when the port is administratively disabled. This
is because if the link is down, the switch on the other side will see
that the port has no link and will stop trying to send traffic to it.

As far as I can tell, the reason it's a flag is because some users wanted
the behavior the other way.

I'm not sure it's really related to the behavior here.

For what it's worth, I'm in favor of containing things like this into
ethtool as well.

Thanks,
Jake
