linux-kernel.vger.kernel.org archive mirror
From: Jakub Kicinski <kuba@kernel.org>
To: Jiri Pirko <jiri@resnulli.us>
Cc: Vasundhara Volam <vasundhara-v.volam@broadcom.com>,
	Moshe Shemesh <moshe@mellanox.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jiri Pirko <jiri@mellanox.com>, Netdev <netdev@vger.kernel.org>,
	open list <linux-kernel@vger.kernel.org>,
	Michael Chan <michael.chan@broadcom.com>
Subject: Re: [PATCH net-next RFC v4 01/15] devlink: Add reload action option to devlink reload command
Date: Mon, 14 Sep 2020 14:31:00 -0700
Message-ID: <20200914143100.06a4641d@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
In-Reply-To: <20200914112829.GC2236@nanopsycho.orion>

On Mon, 14 Sep 2020 13:28:29 +0200 Jiri Pirko wrote:
> Mon, Sep 14, 2020 at 11:54:55AM CEST, vasundhara-v.volam@broadcom.com wrote:
> >On Mon, Sep 14, 2020 at 3:02 PM Jiri Pirko <jiri@resnulli.us> wrote:  
> >> >> +mlxsw_devlink_core_bus_device_reload_up(struct devlink *devlink, enum devlink_reload_action action,
> >> >> +                                       struct netlink_ext_ack *extack,
> >> >> +                                       unsigned long *actions_performed)  
> >> >Sorry for repeating again, for fw_activate action on our device, all
> >> >the driver entities undergo reset asynchronously once user initiates
> >> >"devlink dev reload action fw_activate" and reload_up does not have
> >> >much to do except reporting actions that will be/being performed.
> >> >
> >> >Once reset is complete, the health reporter will be notified using  
> >>
> >> Hmm, how is the fw reset related to health reporter recovery? Recovery
> >> happens after some error event. I don't believe it is wise to mix it.  
> >Our device has a fw_reset health reporter, which is updated on reset
> >events and firmware activation is one among them. All non-fatal
> >firmware reset events are reported on fw_reset health reporter.  
> 
> Hmm, interesting. In that case, assuming this is fine, should we have
> some standard in this. I mean, if the driver supports reset, should it
> also define the "fw_reset" reporter to report such events?
> 
> Jakub, what is your take here?

Sounds doubly wrong to me.

As you say, health reporters should trigger on error events, so
communicating completion of an action requested by the user
seems very wrong. IIUC operators should monitor and collect
health failures. In this case it looks like all events from fw_reset
would need to be discarded, since they are not meaningful
without the context of what triggered them.

And secondly, reporting the completion via some async mechanism
that the user has to monitor is just plain lazy. That's pushing
the work that has to be done out to user space. Wait for the
completion in the driver.

> >> Instead, why don't you block in reload_up() until the reset is complete?  
> >
> >Though user initiate "devlink dev reload" event on a single interface,
> >all driver entities undergo reset and all entities recover
> >independently. I don't think we can block the reload_up() on the
> >interface(that user initiated the command), until whole reset is
> >complete.  
> 
> Why not? mlxsw reset takes up to like 10 seconds for example.

+1, why?
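
The suggestion above to wait for the completion in the driver, rather than
signalling it through an async mechanism, can be illustrated with a small
userspace analogy. This is only a sketch under stated assumptions: `fake_dev`,
`fake_async_reset()` and `fake_reload_up()` are hypothetical names that do not
appear in the patch set, and POSIX threads stand in for the kernel primitives
a driver would more likely use (for example `struct completion` with
`wait_for_completion_timeout()`). It is not how mlxsw, mlx5 or bnxt implement
reload.

```c
/* Userspace analogy only: pthreads stand in for kernel completions. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical device state; not a real devlink or driver structure. */
struct fake_dev {
	pthread_mutex_t lock;
	pthread_cond_t reset_done_cv;
	bool reset_done;
	unsigned int reset_delay_ms;	/* how long the simulated reset takes */
};

static void fake_dev_init(struct fake_dev *dev, unsigned int delay_ms)
{
	int err;

	err = pthread_mutex_init(&dev->lock, NULL);
	assert(err == 0);
	err = pthread_cond_init(&dev->reset_done_cv, NULL);
	assert(err == 0);
	dev->reset_done = false;
	dev->reset_delay_ms = delay_ms;
}

/* Simulates the asynchronous reset that runs after the user's request. */
static void *fake_async_reset(void *arg)
{
	struct fake_dev *dev = arg;
	struct timespec delay = {
		.tv_sec = dev->reset_delay_ms / 1000,
		.tv_nsec = (long)(dev->reset_delay_ms % 1000) * 1000000L,
	};

	nanosleep(&delay, NULL);

	pthread_mutex_lock(&dev->lock);
	dev->reset_done = true;
	pthread_cond_broadcast(&dev->reset_done_cv);
	pthread_mutex_unlock(&dev->lock);
	return NULL;
}

/*
 * Starts the reset and blocks until it finishes or timeout_ms expires.
 * Returns 0 on completion and -ETIMEDOUT if the deadline passed first.
 */
static int fake_reload_up(struct fake_dev *dev, unsigned int timeout_ms)
{
	struct timespec deadline;
	pthread_t worker;
	bool done;
	int err = 0;

	if (pthread_create(&worker, NULL, fake_async_reset, dev))
		return -EAGAIN;

	/* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&dev->lock);
	while (!dev->reset_done && err == 0)
		err = pthread_cond_timedwait(&dev->reset_done_cv, &dev->lock,
					     &deadline);
	done = dev->reset_done;	/* record the outcome as of the deadline */
	pthread_mutex_unlock(&dev->lock);

	/* Join so the sketch does not leak the thread or outlive dev. */
	pthread_join(worker, NULL);
	return done ? 0 : -ETIMEDOUT;
}
```

With this shape the caller gets a single synchronous result: 0 once the
simulated reset has finished, or -ETIMEDOUT if it did not finish in time.
Nothing has to be monitored separately from user space.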


Thread overview: 57+ messages
2020-09-14  6:07 [PATCH net-next RFC v4 00/15] Add devlink reload action and Moshe Shemesh
2020-09-14  6:07 ` [PATCH net-next RFC v4 01/15] devlink: Add reload action option to devlink reload command Moshe Shemesh
2020-09-14  7:08   ` Vasundhara Volam
2020-09-14  9:32     ` Jiri Pirko
2020-09-14  9:54       ` Vasundhara Volam
2020-09-14 11:28         ` Jiri Pirko
2020-09-14 21:31           ` Jakub Kicinski [this message]
2020-09-14 22:06             ` Michael Chan
2020-09-15  6:18               ` Jiri Pirko
2020-09-14 12:27   ` Jiri Pirko
2020-09-15 12:12     ` Moshe Shemesh
2020-09-15 13:26       ` Jiri Pirko
2020-09-15 20:06         ` Moshe Shemesh
2020-09-14 21:33   ` Jakub Kicinski
2020-09-15 12:56     ` Moshe Shemesh
2020-09-15 13:26       ` Jiri Pirko
2020-09-15 16:00       ` Jakub Kicinski
2020-09-14  6:07 ` [PATCH net-next RFC v4 02/15] devlink: Add reload action limit level Moshe Shemesh
2020-09-14 13:10   ` Jiri Pirko
2020-09-15 12:15     ` Moshe Shemesh
2020-09-14  6:07 ` [PATCH net-next RFC v4 03/15] devlink: Add reload action stats Moshe Shemesh
2020-09-14 13:39   ` Jiri Pirko
2020-09-15 12:30     ` Moshe Shemesh
2020-09-15 13:33       ` Jiri Pirko
2020-09-15 20:20         ` Moshe Shemesh
2020-09-16  6:07           ` Jiri Pirko
2020-09-14  6:07 ` [PATCH net-next RFC v4 04/15] devlink: Add reload actions stats to dev get Moshe Shemesh
2020-09-14 13:45   ` Jiri Pirko
2020-09-15  6:45     ` Ido Schimmel
2020-09-15  7:44       ` Jiri Pirko
2020-09-15 12:31         ` Moshe Shemesh
2020-09-15 13:34           ` Jiri Pirko
2020-09-15 20:33             ` Moshe Shemesh
2020-09-18 16:13               ` Moshe Shemesh
2020-09-21 10:33                 ` Jiri Pirko
2020-09-14  6:07 ` [PATCH net-next RFC v4 05/15] net/mlx5: Add functions to set/query MFRL register Moshe Shemesh
2020-09-14  6:07 ` [PATCH net-next RFC v4 06/15] net/mlx5: Set cap for pci sync for fw update event Moshe Shemesh
2020-09-14  6:07 ` [PATCH net-next RFC v4 07/15] net/mlx5: Handle sync reset request event Moshe Shemesh
2020-09-14  6:07 ` [PATCH net-next RFC v4 08/15] net/mlx5: Handle sync reset now event Moshe Shemesh
2020-09-14  6:07 ` [PATCH net-next RFC v4 09/15] net/mlx5: Handle sync reset abort event Moshe Shemesh
2020-09-14  6:07 ` [PATCH net-next RFC v4 10/15] net/mlx5: Add support for devlink reload action fw activate Moshe Shemesh
2020-09-14 13:52   ` Jiri Pirko
2020-09-15 12:38     ` Moshe Shemesh
2020-09-14 13:54   ` Jiri Pirko
2020-09-15 12:44     ` Moshe Shemesh
2020-09-15 13:37       ` Jiri Pirko
2020-09-15 20:28         ` Moshe Shemesh
2020-09-16  6:08           ` Jiri Pirko
2020-09-14  6:07 ` [PATCH net-next RFC v4 11/15] devlink: Add enable_remote_dev_reset generic parameter Moshe Shemesh
2020-09-14 14:12   ` Jiri Pirko
2020-09-14  6:07 ` [PATCH net-next RFC v4 12/15] net/mlx5: Add devlink param enable_remote_dev_reset support Moshe Shemesh
2020-09-14  6:08 ` [PATCH net-next RFC v4 13/15] net/mlx5: Add support for fw live patch event Moshe Shemesh
2020-09-14  6:08 ` [PATCH net-next RFC v4 14/15] net/mlx5: Add support for devlink reload action limit level no reset Moshe Shemesh
2020-09-14  6:08 ` [PATCH net-next RFC v4 15/15] devlink: Add Documentation/networking/devlink/devlink-reload.rst Moshe Shemesh
2020-09-14 11:43   ` Jiri Pirko
2020-09-15 16:04   ` Jakub Kicinski
2020-09-15 19:59     ` Moshe Shemesh
