linux-nvme.lists.infradead.org archive mirror
From: Keith Busch <kbusch@kernel.org>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>,
	Christoph Hellwig <hch@lst.de>,
	lixingyuan@inspur.com, linux-nvme@lists.infradead.org
Subject: Re: [Regression] Bug 216400 - Firmware activation starting AEN processing prevents further AER commands sent to the NVMe controller.
Date: Mon, 29 Aug 2022 10:29:26 -0600
Message-ID: <YwzpZpBgXM/U9GEP@kbusch-mbp.dhcp.thefacebook.com>
In-Reply-To: <dfe23fad-671f-f3fc-4a99-52690b2c1bf2@grimberg.me>

On Mon, Aug 29, 2022 at 12:14:21PM +0300, Sagi Grimberg wrote:
> 
> 
> On 8/26/22 15:19, Thorsten Leemhuis wrote:
> > Hi, this is your Linux kernel regression tracker.
> > 
> > I noticed a regression report in bugzilla.kernel.org that afaics nobody
> > acted upon since it was reported. That's why I decided to forward it by
> > mail to those that afaics should handle this.
> > 
> > To quote from https://bugzilla.kernel.org/show_bug.cgi?id=216400 :
> > 
> > >   lixingyuan 2022-08-23 01:14:50 UTC
> > > 
> > > This bug is related to these two commits:
> > > 
> > > 1. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.0-rc2&id=4c75f877853cfa81b12374a07208e07b077f39b8
> > > 
> > > > This code sets the controller state to NVME_CTRL_RESETTING while handling the firmware activation starting AEN.
> > > 
> > > 2. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.0-rc2&id=0fa0f99fc84e41057cbdd2efbfe91c6b2f47dd9d
> > > 
> > > > When submitting a new AER command to the controller, this code checks whether the controller state is NVME_CTRL_LIVE. This causes the problem: once the firmware activation starting AEN has been processed, the controller state is already NVME_CTRL_RESETTING, so no new AER commands are sent to the controller.
> 
> I see.
> 
> I can modify this code to check in the drivers instead of the core.
> 
> Keith, pci does not risk submitting an async event on a freed admin
> queue? if not, I can add a proper check there as well...

I don't think we'd attempt to issue an admin command while the queue is down,
at least not in the pci driver.

I think it should be sufficient to requeue ctrl->async_event_work after the
activation work completes, no?
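
To illustrate the suggestion, here is a minimal user-space sketch of the
sequence described in the report. It is not kernel code: the struct, the
enum, and every function name below are hypothetical stand-ins, and the
"work queue" is reduced to a single pending flag. It only models the
ordering problem (the AEN work runs while the state is RESETTING and is
skipped) and what requeueing after the activation work would change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; names only loosely mirror the driver. */
enum ctrl_state { CTRL_LIVE, CTRL_RESETTING };

struct ctrl {
	enum ctrl_state state;
	int aer_outstanding;	/* AER commands currently posted */
	bool aen_work_pending;	/* stands in for a queued async_event_work */
};

static void queue_async_event_work(struct ctrl *c)
{
	c->aen_work_pending = true;
}

/* Models the state check: an AER is only submitted while LIVE. */
static void async_event_work(struct ctrl *c)
{
	c->aen_work_pending = false;
	if (c->state != CTRL_LIVE)
		return;		/* skipped, and nothing re-arms it */
	c->aer_outstanding++;
}

static void run_pending(struct ctrl *c)
{
	if (c->aen_work_pending)
		async_event_work(c);
}

/* The outstanding AER completes with "firmware activation starting". */
static void complete_fw_act_aen(struct ctrl *c)
{
	assert(c->aer_outstanding > 0);
	c->aer_outstanding--;
	c->state = CTRL_RESETTING;
	queue_async_event_work(c);	/* usual re-arm after an AEN */
}

/* Activation finished; optionally requeue the AEN work as suggested. */
static void fw_act_work(struct ctrl *c, bool requeue_aen)
{
	c->state = CTRL_LIVE;
	if (requeue_aen)
		queue_async_event_work(c);
}

/* Returns the number of AERs outstanding after firmware activation. */
int simulate(bool requeue_aen)
{
	struct ctrl c = { CTRL_LIVE, 0, false };

	queue_async_event_work(&c);
	run_pending(&c);		/* initial AER is posted */
	complete_fw_act_aen(&c);
	run_pending(&c);		/* runs while RESETTING: skipped */
	fw_act_work(&c, requeue_aen);
	run_pending(&c);
	return c.aer_outstanding;
}
```

In this model simulate(false) leaves no AER outstanding after activation,
which matches the reported behaviour, while simulate(true) ends with one
AER posted again.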


Thread overview: 4+ messages
2022-08-26 12:19 [Regression] Bug 216400 - Firmware activation starting AEN processing prevents further AER commands sent to the NVMe controller Thorsten Leemhuis
2022-08-29  9:14 ` Sagi Grimberg
2022-08-29 16:29   ` Keith Busch [this message]
2022-08-30  7:03     ` Sagi Grimberg
