From: "Robert Straw" <drbawb@fatalsyntax.com>
To: "Christoph Hellwig" <hch@infradead.org>
Cc: "Bjorn Helgaas" <helgaas@kernel.org>, <bhelgaas@google.com>,
<linux-pci@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
"Alex Williamson" <alex.williamson@redhat.com>
Subject: Re: [PATCH] pci: add NVMe FLR quirk to the SM951 SSD
Date: Wed, 19 May 2021 07:54:19 -0500
Message-ID: <CBH8K74TF8IQ.2KUOIGFJ7K8XP@nagato>
In-Reply-To: <YKTP2GQkLz5jma/q@infradead.org>
On Wed May 19, 2021 at 3:44 AM CDT, Christoph Hellwig wrote:
> On Sat, May 15, 2021 at 12:20:05PM -0500, Robert Straw wrote:
> While it doesn't matter here, NVMe 1.1 is very much out of data, being
> a more than 8 year old specification. The current version is 1.4b,
> with NVMe 2.0 about to be released.
I can't comment on 2.0, but yes, 1.4b has the same aside regarding undefined
behavior on the SHST field (on p. 50). The only reason I was looking at
1.1a is that it's specifically listed on the datasheet for the SM951
(the device under test).
> No, we don't. This is a bug particular to a specific implementation.
> In fact the whole existing NVMe shutdown before reset quirk is rather
> broken and dangerous, as it concurrently accesses the NVMe registers
> with the actual driver, which could be trivially triggered through the
> sysfs reset attribute.
I'm not exactly clear in what way the nvme driver would be racing against
vfio-pci here: (a) vfio-pci is the driver bound in this scenario, and (b)
the vfio-pci driver triggers this quirk by issuing an FLR, which is done
with the device locked (e.g., vfio_pci.c:499).
In my testing *without this patch* vfio-pci is still bound to the device
for at least 60s after guest shutdown, at which point the FLR times out.
After this FLR the device is useless w/o a full reboot of the host.
Rebinding it, whether to another guest w/ vfio-pci or to the Linux nvme
driver, doesn't help: the device can no longer be reconfigured.
As I understand it, vfio-pci should not blindly issue an FLR to an NVMe class
device w/o obeying the protocol. The protocol seems clear that after a
shutdown CC->EN must transition from 1 to 0. (I would argue the guest OS
leaving the device in this state is the actual violation of the spec. As
I'm unable to change that behavior, having vfio-pci clean up the mess w/
this quirk seems to be an adequate workaround.)
I am currently testing a version of this patch that only disables the
controller if the device has been previously shut down. I am trying to
gauge whether this would be preferable to just blanket-disabling these
bugged devices before relinquishing control of them back to the host.
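To make the conditional variant concrete, here is a rough userspace sketch of
the check. It is not the patch itself: struct fake_nvme_regs, fake_write_cc()
and disable_if_shut_down() are hypothetical names, and the simulated device
simply assumes that clearing CC.EN resets CSTS.RDY and CSTS.SHST. Only the
bit positions (CC.EN = bit 0, CSTS.RDY = bit 0, CSTS.SHST = bits 3:2, with
10b meaning shutdown complete) come from the NVMe spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit layout of the NVMe CC and CSTS registers. */
#define CC_EN            (1u << 0)  /* CC.EN: controller enable */
#define CSTS_RDY         (1u << 0)  /* CSTS.RDY: controller ready */
#define CSTS_SHST_MASK   (3u << 2)  /* CSTS.SHST: shutdown status */
#define CSTS_SHST_CMPLT  (2u << 2)  /* SHST = 10b: shutdown complete */

/* Hypothetical stand-in for the memory-mapped controller registers. */
struct fake_nvme_regs {
	uint32_t cc;
	uint32_t csts;
};

/*
 * Simulated register write. ASSUMPTION: clearing CC.EN resets the
 * controller, which in turn clears CSTS.RDY and CSTS.SHST.
 */
static void fake_write_cc(struct fake_nvme_regs *regs, uint32_t val)
{
	regs->cc = val;
	if (!(val & CC_EN))
		regs->csts &= ~(CSTS_RDY | CSTS_SHST_MASK);
}

/*
 * Clear CC.EN only if the controller is still enabled and reports a
 * completed shutdown. Returns true if the controller was disabled.
 */
static bool disable_if_shut_down(struct fake_nvme_regs *regs)
{
	assert(regs != NULL);

	if (!(regs->cc & CC_EN))
		return false;	/* already disabled, nothing to do */

	if ((regs->csts & CSTS_SHST_MASK) != CSTS_SHST_CMPLT)
		return false;	/* not shut down, leave it alone */

	fake_write_cc(regs, regs->cc & ~CC_EN);
	return true;
}
```

A real quirk would read and write the registers through the mapped BAR and
would presumably also have to wait for CSTS.RDY to clear before issuing the
FLR; the sketch leaves both out.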
> I'd much rather quirk these broken Samsung drivers to not allow
> assigning them to VFIO.
I'd much rather keep using my storage devices. I will leave the
quirk limited to these known-bugged devices.