Linux-NVME Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v3] nvme: retain split access workaround for capability reads
@ 2019-10-07 11:42 Ard Biesheuvel
  2019-10-07 12:07 ` Christoph Hellwig
  0 siblings, 1 reply; 10+ messages in thread
From: Ard Biesheuvel @ 2019-10-07 11:42 UTC (permalink / raw)
  To: linux-nvme; +Cc: sagi, Ard Biesheuvel, ilias.apalodimas, axboe, kbusch, hch

Commit 7fd8930f26be4

  "nvme: add a common helper to read Identify Controller data"

has re-introduced an issue that we have attempted to work around in the
past, in commit a310acd7a7ea ("NVMe: use split lo_hi_{read,write}q").

The problem is that some PCIe NVMe controllers do not implement 64-bit
outbound accesses correctly, which is why the commit above switched
to using lo_hi_[read|write]q for all 64-bit BAR accesses occuring in
the code.

In the mean time, the NVMe subsystem has been refactored, and now calls
into the PCIe support layer for NVMe via a .reg_read64() method, which
fails to use lo_hi_readq(), and thus reintroduces the problem that the
workaround above aimed to address.

Given that, at the moment, .reg_read64() is only used to read the
capability register [which is known to tolerate split reads], let's
switch .reg_read64() to lo_hi_readq() as well.

This fixes a boot issue on some ARM boxes with NVMe behind a Synopsys
DesignWare PCIe host controller.

Fixes: 7fd8930f26be4 ("nvme: add a common helper to read Identify Controller data")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
v3: add comment explaining why we need the split read
v2: drop WARN_ON(), add reference to commit that [re-]introduced the issue

 drivers/nvme/host/pci.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index bb88681f4dc3..f908acc5d2ef 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2672,7 +2672,8 @@ static int nvme_pci_reg_write32(struct nvme_ctrl *ctrl, u32 off, u32 val)
 
 static int nvme_pci_reg_read64(struct nvme_ctrl *ctrl, u32 off, u64 *val)
 {
-	*val = readq(to_nvme_dev(ctrl)->bar + off);
+	/* use a split read to work around buggy interconnects */
+	*val = lo_hi_readq(to_nvme_dev(ctrl)->bar + off);
 	return 0;
 }
 
-- 
2.20.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3] nvme: retain split access workaround for capability reads
  2019-10-07 11:42 [PATCH v3] nvme: retain split access workaround for capability reads Ard Biesheuvel
@ 2019-10-07 12:07 ` Christoph Hellwig
  2019-10-07 12:24   ` Ard Biesheuvel
  0 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2019-10-07 12:07 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: sagi, ilias.apalodimas, linux-nvme, axboe, kbusch, hch

On Mon, Oct 07, 2019 at 01:42:53PM +0200, Ard Biesheuvel wrote:
> This fixes a boot issue on some ARM boxes with NVMe behind a Synopsys
> DesignWare PCIe host controller.

>  static int nvme_pci_reg_read64(struct nvme_ctrl *ctrl, u32 off, u64 *val)
>  {
> -	*val = readq(to_nvme_dev(ctrl)->bar + off);
> +	/* use a split read to work around buggy interconnects */
> +	*val = lo_hi_readq(to_nvme_dev(ctrl)->bar + off);

I though this was to fix back up the broken Apple controllers?

If you interconnect doesn't support 8-byte MMIO read/write TLPs you
have a much deeper problem, as this will break all drivers using
readq/writeq.  And we currently only have compile time detection for
readq/writeq, not runtime so you'll have to invent a scheme if this
works at all or not.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3] nvme: retain split access workaround for capability reads
  2019-10-07 12:07 ` Christoph Hellwig
@ 2019-10-07 12:24   ` Ard Biesheuvel
  2019-10-07 12:27     ` Christoph Hellwig
  0 siblings, 1 reply; 10+ messages in thread
From: Ard Biesheuvel @ 2019-10-07 12:24 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, Keith Busch, Ilias Apalodimas, sagi, linux-nvme

On Mon, 7 Oct 2019 at 14:07, Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, Oct 07, 2019 at 01:42:53PM +0200, Ard Biesheuvel wrote:
> > This fixes a boot issue on some ARM boxes with NVMe behind a Synopsys
> > DesignWare PCIe host controller.
>
> >  static int nvme_pci_reg_read64(struct nvme_ctrl *ctrl, u32 off, u64 *val)
> >  {
> > -     *val = readq(to_nvme_dev(ctrl)->bar + off);
> > +     /* use a split read to work around buggy interconnects */
> > +     *val = lo_hi_readq(to_nvme_dev(ctrl)->bar + off);
>
> I though this was to fix back up the broken Apple controllers?
>

Yes, and other hardware suffering from the same issue.

> If you interconnect doesn't support 8-byte MMIO read/write TLPs you
> have a much deeper problem, as this will break all drivers using
> readq/writeq.  And we currently only have compile time detection for
> readq/writeq, not runtime so you'll have to invent a scheme if this
> works at all or not.

Sure. But the practical reality is that the hardware in question
(including the Apple controller) worked perfectly fine until commit
7fd8930f26be4 introduced a readq() call into a file that had
deliberately been switched to using lo_hi_readq() because readq()
doesn't work reliably for all hardware we would like it to support.
Theorizing about *why* readq() doesn't work reliably in which
particular case doesn't seem that useful to me, given how trivial the
fix is.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3] nvme: retain split access workaround for capability reads
  2019-10-07 12:24   ` Ard Biesheuvel
@ 2019-10-07 12:27     ` Christoph Hellwig
  2019-10-07 12:32       ` Ard Biesheuvel
  0 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2019-10-07 12:27 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: sagi, Ilias Apalodimas, linux-nvme, axboe, Keith Busch,
	Christoph Hellwig

On Mon, Oct 07, 2019 at 02:24:58PM +0200, Ard Biesheuvel wrote:
> > If you interconnect doesn't support 8-byte MMIO read/write TLPs you
> > have a much deeper problem, as this will break all drivers using
> > readq/writeq.  And we currently only have compile time detection for
> > readq/writeq, not runtime so you'll have to invent a scheme if this
> > works at all or not.
> 
> Sure. But the practical reality is that the hardware in question
> (including the Apple controller) worked perfectly fine until commit
> 7fd8930f26be4 introduced a readq() call into a file that had
> deliberately been switched to using lo_hi_readq() because readq()
> doesn't work reliably for all hardware we would like it to support.
> Theorizing about *why* readq() doesn't work reliably in which
> particular case doesn't seem that useful to me, given how trivial the
> fix is.

My point here is that if it isn't the PCIe device that is broken like
in the apple case, but your interconnect you have a problem that can't
be fixed just in the nvme driver.  We have tons of other drivers relying
in readq/writeq working if it is available.  You'll need to find a more
general workaround, independent of the fact that we have a few NVMe
controllers that always need this workaround.  And at least for NVMe
the spec specically allows split 32-bit access at least.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3] nvme: retain split access workaround for capability reads
  2019-10-07 12:27     ` Christoph Hellwig
@ 2019-10-07 12:32       ` Ard Biesheuvel
  2019-10-07 12:47         ` Christoph Hellwig
  2019-10-07 12:48         ` Keith Busch
  0 siblings, 2 replies; 10+ messages in thread
From: Ard Biesheuvel @ 2019-10-07 12:32 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, Keith Busch, Ilias Apalodimas, sagi, linux-nvme

On Mon, 7 Oct 2019 at 14:27, Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, Oct 07, 2019 at 02:24:58PM +0200, Ard Biesheuvel wrote:
> > > If you interconnect doesn't support 8-byte MMIO read/write TLPs you
> > > have a much deeper problem, as this will break all drivers using
> > > readq/writeq.  And we currently only have compile time detection for
> > > readq/writeq, not runtime so you'll have to invent a scheme if this
> > > works at all or not.
> >
> > Sure. But the practical reality is that the hardware in question
> > (including the Apple controller) worked perfectly fine until commit
> > 7fd8930f26be4 introduced a readq() call into a file that had
> > deliberately been switched to using lo_hi_readq() because readq()
> > doesn't work reliably for all hardware we would like it to support.
> > Theorizing about *why* readq() doesn't work reliably in which
> > particular case doesn't seem that useful to me, given how trivial the
> > fix is.
>
> My point here is that if it isn't the PCIe device that is broken like
> in the apple case, but your interconnect you have a problem that can't
> be fixed just in the nvme driver.  We have tons of other drivers relying
> in readq/writeq working if it is available.  You'll need to find a more
> general workaround, independent of the fact that we have a few NVMe
> controllers that always need this workaround.  And at least for NVMe
> the spec specically allows split 32-bit access at least.

OK, that is good to know. Mind you, I used 'interconnect' in the
abstract sense, meaning whatever sits between the CPU doing the read
and the 64-bit register in the BAR space.

But I fail to see your point. Why is it relevant for deciding whether
to apply a NVMe fix if the affected hardware can or cannot use other
types of PCIe devices? Note that I am not proposing some hacky
workaround to be applied, but just to stick with the workaround that
was already accepted (and I'm pretty sure that this Apple hardware got
broken too with commit 7fd8930f26be4)

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3] nvme: retain split access workaround for capability reads
  2019-10-07 12:32       ` Ard Biesheuvel
@ 2019-10-07 12:47         ` Christoph Hellwig
  2019-10-07 12:48         ` Keith Busch
  1 sibling, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2019-10-07 12:47 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: sagi, Ilias Apalodimas, linux-nvme, axboe, Keith Busch,
	Christoph Hellwig

On Mon, Oct 07, 2019 at 02:32:43PM +0200, Ard Biesheuvel wrote:
> OK, that is good to know. Mind you, I used 'interconnect' in the
> abstract sense, meaning whatever sits between the CPU doing the read
> and the 64-bit register in the BAR space.
> 
> But I fail to see your point. Why is it relevant for deciding whether
> to apply a NVMe fix if the affected hardware can or cannot use other
> types of PCIe devices? Note that I am not proposing some hacky
> workaround to be applied, but just to stick with the workaround that
> was already accepted (and I'm pretty sure that this Apple hardware got
> broken too with commit 7fd8930f26be4)

That isn't the point.  I'm fine with the technical changes, but the
commit log and the comment seem to imply that this fixes your
interconnect issue.  It does not - at best it works around it for this
particular device, and even that only as a side effet for fixing a device
side bug.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3] nvme: retain split access workaround for capability reads
  2019-10-07 12:32       ` Ard Biesheuvel
  2019-10-07 12:47         ` Christoph Hellwig
@ 2019-10-07 12:48         ` Keith Busch
  2019-10-07 13:20           ` Ard Biesheuvel
  1 sibling, 1 reply; 10+ messages in thread
From: Keith Busch @ 2019-10-07 12:48 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: axboe, Ilias Apalodimas, Christoph Hellwig, linux-nvme, sagi

On Mon, Oct 07, 2019 at 02:32:43PM +0200, Ard Biesheuvel wrote:
> On Mon, 7 Oct 2019 at 14:27, Christoph Hellwig <hch@lst.de> wrote:
> >
> > On Mon, Oct 07, 2019 at 02:24:58PM +0200, Ard Biesheuvel wrote:
> > > > If you interconnect doesn't support 8-byte MMIO read/write TLPs you
> > > > have a much deeper problem, as this will break all drivers using
> > > > readq/writeq.  And we currently only have compile time detection for
> > > > readq/writeq, not runtime so you'll have to invent a scheme if this
> > > > works at all or not.
> > >
> > > Sure. But the practical reality is that the hardware in question
> > > (including the Apple controller) worked perfectly fine until commit
> > > 7fd8930f26be4 introduced a readq() call into a file that had
> > > deliberately been switched to using lo_hi_readq() because readq()
> > > doesn't work reliably for all hardware we would like it to support.
> > > Theorizing about *why* readq() doesn't work reliably in which
> > > particular case doesn't seem that useful to me, given how trivial the
> > > fix is.
> >
> > My point here is that if it isn't the PCIe device that is broken like
> > in the apple case, but your interconnect you have a problem that can't
> > be fixed just in the nvme driver.  We have tons of other drivers relying
> > in readq/writeq working if it is available.  You'll need to find a more
> > general workaround, independent of the fact that we have a few NVMe
> > controllers that always need this workaround.  And at least for NVMe
> > the spec specically allows split 32-bit access at least.
> 
> OK, that is good to know. Mind you, I used 'interconnect' in the
> abstract sense, meaning whatever sits between the CPU doing the read
> and the 64-bit register in the BAR space.
> 
> But I fail to see your point. Why is it relevant for deciding whether
> to apply a NVMe fix if the affected hardware can or cannot use other
> types of PCIe devices? Note that I am not proposing some hacky
> workaround to be applied, but just to stick with the workaround that
> was already accepted (and I'm pretty sure that this Apple hardware got
> broken too with commit 7fd8930f26be4)

The point is the reasoning in the changelog does not justify *this* patch. If
you change the wording to not mention your host controller, and instead just
refer to the previous NVMe behavior (and modify your comment accordingly), then
we should be fine.

If you explain this patch by saying it's to fix a host controller, then the
patch is no longer sufficient on it's own and should be fixed elsewhere,
perhaps by providing a special pci_ops structure for your controller.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3] nvme: retain split access workaround for capability reads
  2019-10-07 12:48         ` Keith Busch
@ 2019-10-07 13:20           ` Ard Biesheuvel
  2019-10-07 13:32             ` Keith Busch
  0 siblings, 1 reply; 10+ messages in thread
From: Ard Biesheuvel @ 2019-10-07 13:20 UTC (permalink / raw)
  To: Keith Busch; +Cc: axboe, Ilias Apalodimas, Christoph Hellwig, linux-nvme, sagi

On Mon, 7 Oct 2019 at 14:48, Keith Busch <kbusch@kernel.org> wrote:
>
> On Mon, Oct 07, 2019 at 02:32:43PM +0200, Ard Biesheuvel wrote:
> > On Mon, 7 Oct 2019 at 14:27, Christoph Hellwig <hch@lst.de> wrote:
> > >
> > > On Mon, Oct 07, 2019 at 02:24:58PM +0200, Ard Biesheuvel wrote:
> > > > > If you interconnect doesn't support 8-byte MMIO read/write TLPs you
> > > > > have a much deeper problem, as this will break all drivers using
> > > > > readq/writeq.  And we currently only have compile time detection for
> > > > > readq/writeq, not runtime so you'll have to invent a scheme if this
> > > > > works at all or not.
> > > >
> > > > Sure. But the practical reality is that the hardware in question
> > > > (including the Apple controller) worked perfectly fine until commit
> > > > 7fd8930f26be4 introduced a readq() call into a file that had
> > > > deliberately been switched to using lo_hi_readq() because readq()
> > > > doesn't work reliably for all hardware we would like it to support.
> > > > Theorizing about *why* readq() doesn't work reliably in which
> > > > particular case doesn't seem that useful to me, given how trivial the
> > > > fix is.
> > >
> > > My point here is that if it isn't the PCIe device that is broken like
> > > in the apple case, but your interconnect you have a problem that can't
> > > be fixed just in the nvme driver.  We have tons of other drivers relying
> > > in readq/writeq working if it is available.  You'll need to find a more
> > > general workaround, independent of the fact that we have a few NVMe
> > > controllers that always need this workaround.  And at least for NVMe
> > > the spec specically allows split 32-bit access at least.
> >
> > OK, that is good to know. Mind you, I used 'interconnect' in the
> > abstract sense, meaning whatever sits between the CPU doing the read
> > and the 64-bit register in the BAR space.
> >
> > But I fail to see your point. Why is it relevant for deciding whether
> > to apply a NVMe fix if the affected hardware can or cannot use other
> > types of PCIe devices? Note that I am not proposing some hacky
> > workaround to be applied, but just to stick with the workaround that
> > was already accepted (and I'm pretty sure that this Apple hardware got
> > broken too with commit 7fd8930f26be4)
>
> The point is the reasoning in the changelog does not justify *this* patch. If
> you change the wording to not mention your host controller, and instead just
> refer to the previous NVMe behavior (and modify your comment accordingly), then
> we should be fine.
>
> If you explain this patch by saying it's to fix a host controller, then the
> patch is no longer sufficient on it's own and should be fixed elsewhere,
> perhaps by providing a special pci_ops structure for your controller.

Fair enough. Any suggestions for the wording of the comment?

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3] nvme: retain split access workaround for capability reads
  2019-10-07 13:20           ` Ard Biesheuvel
@ 2019-10-07 13:32             ` Keith Busch
  2019-10-07 13:33               ` Ard Biesheuvel
  0 siblings, 1 reply; 10+ messages in thread
From: Keith Busch @ 2019-10-07 13:32 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: axboe, Ilias Apalodimas, Christoph Hellwig, linux-nvme, sagi

On Mon, Oct 07, 2019 at 03:20:11PM +0200, Ard Biesheuvel wrote:
> On Mon, 7 Oct 2019 at 14:48, Keith Busch <kbusch@kernel.org> wrote:
> > If you explain this patch by saying it's to fix a host controller, then the
> > patch is no longer sufficient on it's own and should be fixed elsewhere,
> > perhaps by providing a special pci_ops structure for your controller.
> 
> Fair enough. Any suggestions for the wording of the comment?

Something like this for the changelog:

  Subject: nvme/pci: Split 8-byte reads

  The nvme pci driver had split 8-byte register reads using lo_hi_readq() due to
  nvme controllers that do not support that sized access. This behavior was
  inadvertently changed to readq(), which may break those controllers. Restore
  the previous behavior.

  Fixes: 7fd8930f26be4 ("nvme: add a common helper to read Identify Controller data")

And the comment:

	/*
	 * Split the 8-byte read into two 4-byte reads since all controllers
	 * support 4 byte register reads, but some do not support the larger
	 * access size.
	 */

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3] nvme: retain split access workaround for capability reads
  2019-10-07 13:32             ` Keith Busch
@ 2019-10-07 13:33               ` Ard Biesheuvel
  0 siblings, 0 replies; 10+ messages in thread
From: Ard Biesheuvel @ 2019-10-07 13:33 UTC (permalink / raw)
  To: Keith Busch; +Cc: axboe, Ilias Apalodimas, Christoph Hellwig, linux-nvme, sagi

On Mon, 7 Oct 2019 at 15:32, Keith Busch <kbusch@kernel.org> wrote:
>
> On Mon, Oct 07, 2019 at 03:20:11PM +0200, Ard Biesheuvel wrote:
> > On Mon, 7 Oct 2019 at 14:48, Keith Busch <kbusch@kernel.org> wrote:
> > > If you explain this patch by saying it's to fix a host controller, then the
> > > patch is no longer sufficient on it's own and should be fixed elsewhere,
> > > perhaps by providing a special pci_ops structure for your controller.
> >
> > Fair enough. Any suggestions for the wording of the comment?
>
> Something like this for the changelog:
>
>   Subject: nvme/pci: Split 8-byte reads
>
>   The nvme pci driver had split 8-byte register reads using lo_hi_readq() due to
>   nvme controllers that do not support that sized access. This behavior was
>   inadvertently changed to readq(), which may break those controllers. Restore
>   the previous behavior.
>
>   Fixes: 7fd8930f26be4 ("nvme: add a common helper to read Identify Controller data")
>
> And the comment:
>
>         /*
>          * Split the 8-byte read into two 4-byte reads since all controllers
>          * support 4 byte register reads, but some do not support the larger
>          * access size.
>          */

Thanks Keith.

I'll respin with that.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, back to index

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-07 11:42 [PATCH v3] nvme: retain split access workaround for capability reads Ard Biesheuvel
2019-10-07 12:07 ` Christoph Hellwig
2019-10-07 12:24   ` Ard Biesheuvel
2019-10-07 12:27     ` Christoph Hellwig
2019-10-07 12:32       ` Ard Biesheuvel
2019-10-07 12:47         ` Christoph Hellwig
2019-10-07 12:48         ` Keith Busch
2019-10-07 13:20           ` Ard Biesheuvel
2019-10-07 13:32             ` Keith Busch
2019-10-07 13:33               ` Ard Biesheuvel

Linux-NVME Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nvme/0 linux-nvme/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nvme linux-nvme/ https://lore.kernel.org/linux-nvme \
		linux-nvme@lists.infradead.org
	public-inbox-index linux-nvme

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.infradead.lists.linux-nvme


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git