From: Keith Busch <kbusch@kernel.org>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: Edmund Nadolski <edmund.nadolski@intel.com>,
Christoph Hellwig <hch@lst.de>,
linux-nvme@lists.infradead.org
Subject: Re: [PATCH 2/5] nvme: Prevent resets during paused states
Date: Thu, 5 Sep 2019 14:35:46 -0600
Message-ID: <20190905203546.GB25467@localhost.localdomain>
In-Reply-To: <5f36518c-7cf0-9fe1-49d7-2b24b3d229fe@grimberg.me>
On Thu, Sep 05, 2019 at 01:23:53PM -0700, Sagi Grimberg wrote:
>
> > A paused controller is doing critical internal activation work. Don't
> > allow a reset to occur by setting it to the resetting state, preventing
> > any future reset from occuring during this time.
>
> Is there a reproducible bug actually being addressed here?
Yes, IO timeouts happen during CSTS.PP, which is normal, and escalating
such errors to reset the controller while it is activating firmware is
not a good idea.

Further, we do not want a user to manually trigger a reset (via sysfs
or other means), so this properly blocks such actions.
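
As a rough illustration of why holding the RESETTING state blocks further
resets, here is a minimal userspace sketch. The toy_* names and the
three-state transition table are assumptions made for this example only;
this is not the driver's nvme_change_ctrl_state(), which handles more
states and takes a lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified states; the real driver has more. */
enum toy_state { TOY_LIVE, TOY_RESETTING, TOY_DELETING };

struct toy_ctrl {
	enum toy_state state;
};

/* Permit a transition only from states that may legally precede it. */
static bool toy_change_state(struct toy_ctrl *c, enum toy_state new_state)
{
	bool ok = false;

	switch (new_state) {
	case TOY_LIVE:
		ok = (c->state == TOY_RESETTING);
		break;
	case TOY_RESETTING:
		/* Already RESETTING (or DELETING): the request is refused. */
		ok = (c->state == TOY_LIVE);
		break;
	case TOY_DELETING:
		/* Deletion cannot be blocked by a pause or a reset. */
		ok = (c->state == TOY_LIVE || c->state == TOY_RESETTING);
		break;
	}
	if (ok)
		c->state = new_state;
	return ok;
}

/* A reset request is accepted only if RESETTING can be entered. */
static bool toy_reset(struct toy_ctrl *c)
{
	return toy_change_state(c, TOY_RESETTING);
}
```

In this sketch, once the pause path has moved the controller to
TOY_RESETTING, any toy_reset() from a timeout handler or from sysfs fails
until the state goes back to TOY_LIVE, while a move to TOY_DELETING still
succeeds.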
> Also, seems a bit "acrobatic" to set the state to RESETTING without
> really resetting it (and then change it back to LIVE before you do
> actually resetting it).
We can think of CSTS.PP as the device internally resetting itself to
activate firmware.
> Would it make sense to look at nvme_ctrl_pp_status when
> scheduling a reset in nvme_reset_ctrl? Just a thought..
We have to be able to reset if we decide CSTS.PP is stuck, i.e. on a fw
activation timeout.
> > Signed-off-by: Keith Busch <kbusch@kernel.org>
> > ---
> > drivers/nvme/host/core.c | 9 ++++++---
> > 1 file changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index 91b1f0e57715..d42167d7594b 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -3705,20 +3705,23 @@ static void nvme_fw_act_work(struct work_struct *work)
> > fw_act_timeout = jiffies +
> > msecs_to_jiffies(admin_timeout * 1000);
> > + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING))
> > + return;
> > +
> > nvme_stop_queues(ctrl);
> > while (nvme_ctrl_pp_status(ctrl)) {
> > if (time_after(jiffies, fw_act_timeout)) {
> > dev_warn(ctrl->device,
> > "Fw activation timeout, reset controller\n");
>
> Would be good if the print will reflect if it resetting or not..
>
> > - nvme_reset_ctrl(ctrl);
> > + if (nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
> > + nvme_reset_ctrl(ctrl);
>
> How can this state change not succeed? ctrl removal?
Right, we can't prevent a transition to a deleting state.
> > break;
> > }
> > msleep(100);
> > }
> > - if (ctrl->state != NVME_CTRL_LIVE)
> > + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
> > return;
>
> In what scenario this will not succeed? if the reset did it?
Controller deletion should be the only reason here.
I see now the "break" for a failed activation ought to be a return,
so I can fix that if you're okay with the rest.
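
For reference, the control flow with the "break" turned into a return
would look roughly like the userspace sketch below. Everything named
toy_* is hypothetical, the poll counter stands in for the jiffies-based
timeout, and incrementing resets_scheduled stands in for
nvme_reset_ctrl(); it only models the state handling discussed above.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_state { TOY_LIVE, TOY_RESETTING, TOY_DELETING };
enum toy_result { TOY_ACTIVATED, TOY_TIMED_OUT_RESET, TOY_ABORTED };

struct toy_ctrl {
	enum toy_state state;
	int pp_polls_left;	/* polls for which the fake CSTS.PP stays set */
	int resets_scheduled;	/* counts simulated reset requests */
};

static bool toy_change_state(struct toy_ctrl *c, enum toy_state new_state)
{
	bool ok;

	if (new_state == TOY_LIVE)
		ok = (c->state == TOY_RESETTING);
	else if (new_state == TOY_RESETTING)
		ok = (c->state == TOY_LIVE);
	else	/* TOY_DELETING */
		ok = (c->state != TOY_DELETING);
	if (ok)
		c->state = new_state;
	return ok;
}

/* Simulated CSTS.PP read: true while the device is still paused. */
static bool toy_pp_status(struct toy_ctrl *c)
{
	if (c->pp_polls_left <= 0)
		return false;
	c->pp_polls_left--;
	return true;
}

static enum toy_result toy_fw_act_work(struct toy_ctrl *c, int max_polls)
{
	int polls = 0;

	/* Hold RESETTING for the whole pause so other resets are refused. */
	if (!toy_change_state(c, TOY_RESETTING))
		return TOY_ABORTED;

	while (toy_pp_status(c)) {
		if (++polls > max_polls) {
			/* Stuck: drop back to LIVE so a real reset is allowed. */
			if (!toy_change_state(c, TOY_LIVE))
				return TOY_ABORTED;
			toy_change_state(c, TOY_RESETTING);
			c->resets_scheduled++;
			/* Return, not break: the reset path owns the state now. */
			return TOY_TIMED_OUT_RESET;
		}
	}

	/* Only controller deletion should make this transition fail. */
	if (!toy_change_state(c, TOY_LIVE))
		return TOY_ABORTED;
	return TOY_ACTIVATED;
}
```

With a return on the timeout path, the final transition to LIVE is only
attempted when activation actually completed, so it can no longer race
with the reset that was just scheduled.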