linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] Urgent bug fix causing Apple SSDs to not work.
       [not found] <PNZPR01MB4415600ACD3C8D9944F15058B8A59@PNZPR01MB4415.INDPRD01.PROD.OUTLOOK.COM>
@ 2021-09-25 18:47 ` Linus Torvalds
  2021-09-25 19:54   ` Keith Busch
  0 siblings, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 2021-09-25 18:47 UTC (permalink / raw)
  To: Aditya Garg
  Cc: kbusch, axboe, hch, sagi, linux-nvme, james.smart,
	chaitanya.kulkarni, akpm, linux-kernel, trivial

On Fri, Sep 24, 2021 at 9:02 PM Aditya Garg <gargaditya08@live.com> wrote:
>
> From: Aditya Garg <gargaditya08@live.com>
> Date: Fri, 24 Sep 2021 15:36:45 +0530
> Subject: [PATCH] Revert nvme to 5.14.5 to fix incompatibility arised in Apple SSDs.
> Fixes: e7006de6c238 (nvme: code command_id with a genctr for use-after-free validation)

I think we need to hear more about the problem than just revert a
commit like this randomly. That commit has already been picked up for
-stable,

What are the exact symptoms, and which Apple SSD is this?

I do find this:

  https://lore.kernel.org/all/cjJiSFV77WM51ciS8EuBcdeBcv9T83PUB-Kw3yi8PuC_LwrrUUnQ3w5RC1PbKvSYE72KryXp3wOJhv4Ov_WWIe2gKWOOo5uwuUjbbFA8HDM=@protonmail.com/

which instead of a revert has an actual patch. Can you try that one?

Keith Busch replied to that one, saying that the Apple SSD might not
be spec compliant, but hey, what else is new? If we start demanding
that hardware comply with specs, we'd have to scrap the whole notion
of working in the real world. Plus it would be very hypocritical of
us, since we ignore all specs when we deem them too limiting (whether
they be language specs, POSIX OS specs, or whatever).

           Linus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Urgent bug fix causing Apple SSDs to not work.
  2021-09-25 18:47 ` [PATCH] Urgent bug fix causing Apple SSDs to not work Linus Torvalds
@ 2021-09-25 19:54   ` Keith Busch
  2021-09-25 20:08     ` Sven Peter
  2021-09-25 20:34     ` Sagi Grimberg
  0 siblings, 2 replies; 9+ messages in thread
From: Keith Busch @ 2021-09-25 19:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Aditya Garg, axboe, hch, sagi, linux-nvme, james.smart,
	chaitanya.kulkarni, akpm, linux-kernel, trivial

On Sat, Sep 25, 2021 at 11:47:08AM -0700, Linus Torvalds wrote:
> On Fri, Sep 24, 2021 at 9:02 PM Aditya Garg <gargaditya08@live.com> wrote:
> >
> > From: Aditya Garg <gargaditya08@live.com>
> > Date: Fri, 24 Sep 2021 15:36:45 +0530
> > Subject: [PATCH] Revert nvme to 5.14.5 to fix incompatibility arised in Apple SSDs.
> > Fixes: e7006de6c238 (nvme: code command_id with a genctr for use-after-free validation)
> 
> I think we need to hear more about the problem than just revert a
> commit like this randomly. That commit has already been picked up for
> -stable,
> 
> What are the exact symptoms, and which Apple SSD is this?
> 
> I do find this:
> 
>   https://lore.kernel.org/all/cjJiSFV77WM51ciS8EuBcdeBcv9T83PUB-Kw3yi8PuC_LwrrUUnQ3w5RC1PbKvSYE72KryXp3wOJhv4Ov_WWIe2gKWOOo5uwuUjbbFA8HDM=@protonmail.com/
> 
> which instead of a revert has an actual patch. Can you try that one?
> 
> Keith Busch replied to that one, saying that the Apple SSD might not
> be spec compliant, but hey, what else is new? If we start demanding
> that hardware comply with specs, we'd have to scrap the whole notion
> of working in the real world. Plus it would be very hypocritical of
> us, since we ignore all specs when we deem them too limiting (whether
> they be language specs, POSIX OS specs, or whatever).

Right, we have a lot of quirks for the apple controllers, what's one
more? :)

Could the following patch be tried? I'm basing this off the 'lspci' from
Orlando, but I'm assuming the previous model has the same limitation,
too.

---
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 7efb31b87f37..f0787233557f 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -979,6 +979,7 @@ EXPORT_SYMBOL_GPL(nvme_cleanup_cmd);
 blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
 {
 	struct nvme_command *cmd = nvme_req(req)->cmd;
+	struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
 	blk_status_t ret = BLK_STS_OK;
 
 	if (!(req->rq_flags & RQF_DONTPREP)) {
@@ -1027,7 +1028,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
 		return BLK_STS_IOERR;
 	}
 
-	nvme_req(req)->genctr++;
+	if (!(ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
+		nvme_req(req)->genctr++;
 	cmd->common.command_id = nvme_cid(req);
 	trace_nvme_setup_cmd(req, cmd);
 	return ret;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 9871c0c9374c..b49761d30df7 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -86,6 +86,12 @@ enum nvme_quirks {
 	 */
 	NVME_QUIRK_NO_DEEPEST_PS		= (1 << 5),
 
+	/*
+	 * The controller requires the command_id value be be limited to the
+	 * queue depth.
+	 */
+	NVME_QUIRK_SKIP_CID_GEN			= (1 << 6),
+
 	/*
 	 * Set MEDIUM priority on SQ creation
 	 */
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b82492cd7503..d9f22ed68185 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3369,7 +3369,10 @@ static const struct pci_device_id nvme_id_table[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2005),
 		.driver_data = NVME_QUIRK_SINGLE_VECTOR |
 				NVME_QUIRK_128_BYTES_SQES |
-				NVME_QUIRK_SHARED_TAGS },
+				NVME_QUIRK_SHARED_TAGS ,
+				NVME_QUIRK_SKIP_CID_GEN },
+	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2006),
+		.driver_data = NVME_QUIRK_SKIP_CID_GEN },
 
 	{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
 	{ 0, }
--

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] Urgent bug fix causing Apple SSDs to not work.
  2021-09-25 19:54   ` Keith Busch
@ 2021-09-25 20:08     ` Sven Peter
  2021-09-25 20:38       ` Keith Busch
  2021-09-25 20:34     ` Sagi Grimberg
  1 sibling, 1 reply; 9+ messages in thread
From: Sven Peter @ 2021-09-25 20:08 UTC (permalink / raw)
  To: Keith Busch, Linus Torvalds
  Cc: Aditya Garg, axboe, hch, sagi, linux-nvme, james.smart,
	chaitanya.kulkarni, akpm, linux-kernel, trivial

Hi,

I actually ran into a similar issue while adding support for the NVMe
controller found on the M1 and assumed it was only present there.

Some background why this happens: ANS2 is a co-processor that emulates
an NVMe MMIO interface and uses the tag as an index to an internal data
structure. On the M1 we can directly talk to ANS2 and while we can submit
commands with a higher index it'll just ignore the upper bits and only
return the lowest eight IIRC in the completion queue.
I guess whatever software is running on the T2 actually has an assert to
ensure that the tag is within the limits before forwarding the command
to ANS2.

I haven't tested the attached patch yet but my WIP tree has an almost identical
one that works fine.



Best,

Sven


On Sat, Sep 25, 2021, at 21:54, Keith Busch wrote:
> On Sat, Sep 25, 2021 at 11:47:08AM -0700, Linus Torvalds wrote:
>> On Fri, Sep 24, 2021 at 9:02 PM Aditya Garg <gargaditya08@live.com> wrote:
>> >
>> > From: Aditya Garg <gargaditya08@live.com>
>> > Date: Fri, 24 Sep 2021 15:36:45 +0530
>> > Subject: [PATCH] Revert nvme to 5.14.5 to fix incompatibility arised in Apple SSDs.
>> > Fixes: e7006de6c238 (nvme: code command_id with a genctr for use-after-free validation)
>> 
>> I think we need to hear more about the problem than just revert a
>> commit like this randomly. That commit has already been picked up for
>> -stable,
>> 
>> What are the exact symptoms, and which Apple SSD is this?
>> 
>> I do find this:
>> 
>>   https://lore.kernel.org/all/cjJiSFV77WM51ciS8EuBcdeBcv9T83PUB-Kw3yi8PuC_LwrrUUnQ3w5RC1PbKvSYE72KryXp3wOJhv4Ov_WWIe2gKWOOo5uwuUjbbFA8HDM=@protonmail.com/
>> 
>> which instead of a revert has an actual patch. Can you try that one?
>> 
>> Keith Busch replied to that one, saying that the Apple SSD might not
>> be spec compliant, but hey, what else is new? If we start demanding
>> that hardware comply with specs, we'd have to scrap the whole notion
>> of working in the real world. Plus it would be very hypocritical of
>> us, since we ignore all specs when we deem them too limiting (whether
>> they be language specs, POSIX OS specs, or whatever).
>
> Right, we have a lot of quirks for the apple controllers, what's one
> more? :)
>
> Could the following patch be tried? I'm basing this off the 'lspci' from
> Orlando, but I'm assuming the previous model has the same limitation,
> too.
>
> ---
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 7efb31b87f37..f0787233557f 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -979,6 +979,7 @@ EXPORT_SYMBOL_GPL(nvme_cleanup_cmd);
>  blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
>  {
>  	struct nvme_command *cmd = nvme_req(req)->cmd;
> +	struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
>  	blk_status_t ret = BLK_STS_OK;
> 
>  	if (!(req->rq_flags & RQF_DONTPREP)) {
> @@ -1027,7 +1028,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, 
> struct request *req)
>  		return BLK_STS_IOERR;
>  	}
> 
> -	nvme_req(req)->genctr++;
> +	if (!(ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
> +		nvme_req(req)->genctr++;
>  	cmd->common.command_id = nvme_cid(req);
>  	trace_nvme_setup_cmd(req, cmd);
>  	return ret;
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 9871c0c9374c..b49761d30df7 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -86,6 +86,12 @@ enum nvme_quirks {
>  	 */
>  	NVME_QUIRK_NO_DEEPEST_PS		= (1 << 5),
> 
> +	/*
> +	 * The controller requires the command_id value be be limited to the
> +	 * queue depth.
> +	 */
> +	NVME_QUIRK_SKIP_CID_GEN			= (1 << 6),
> +
>  	/*
>  	 * Set MEDIUM priority on SQ creation
>  	 */
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index b82492cd7503..d9f22ed68185 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -3369,7 +3369,10 @@ static const struct pci_device_id nvme_id_table[] = {
>  	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2005),
>  		.driver_data = NVME_QUIRK_SINGLE_VECTOR |
>  				NVME_QUIRK_128_BYTES_SQES |
> -				NVME_QUIRK_SHARED_TAGS },
> +				NVME_QUIRK_SHARED_TAGS ,
> +				NVME_QUIRK_SKIP_CID_GEN },
> +	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2006),
> +		.driver_data = NVME_QUIRK_SKIP_CID_GEN },
> 
>  	{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
>  	{ 0, }
> --

-- 
Sven Peter

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Urgent bug fix causing Apple SSDs to not work.
  2021-09-25 19:54   ` Keith Busch
  2021-09-25 20:08     ` Sven Peter
@ 2021-09-25 20:34     ` Sagi Grimberg
  2021-09-25 20:48       ` Keith Busch
       [not found]       ` <PNZPR01MB4415F5304CAA1A83442549DBB8A69@PNZPR01MB4415.INDPRD01.PROD.OUTLOOK.COM>
  1 sibling, 2 replies; 9+ messages in thread
From: Sagi Grimberg @ 2021-09-25 20:34 UTC (permalink / raw)
  To: Keith Busch, Linus Torvalds
  Cc: Aditya Garg, axboe, hch, linux-nvme, james.smart,
	chaitanya.kulkarni, akpm, linux-kernel, trivial

>>> From: Aditya Garg <gargaditya08@live.com>
>>> Date: Fri, 24 Sep 2021 15:36:45 +0530
>>> Subject: [PATCH] Revert nvme to 5.14.5 to fix incompatibility arised in Apple SSDs.
>>> Fixes: e7006de6c238 (nvme: code command_id with a genctr for use-after-free validation)
>>
>> I think we need to hear more about the problem than just revert a
>> commit like this randomly. That commit has already been picked up for
>> -stable,
>>
>> What are the exact symptoms, and which Apple SSD is this?
>>
>> I do find this:
>>
>>    https://lore.kernel.org/all/cjJiSFV77WM51ciS8EuBcdeBcv9T83PUB-Kw3yi8PuC_LwrrUUnQ3w5RC1PbKvSYE72KryXp3wOJhv4Ov_WWIe2gKWOOo5uwuUjbbFA8HDM=@protonmail.com/
>>
>> which instead of a revert has an actual patch. Can you try that one?
>>
>> Keith Busch replied to that one, saying that the Apple SSD might not
>> be spec compliant, but hey, what else is new? If we start demanding
>> that hardware comply with specs, we'd have to scrap the whole notion
>> of working in the real world. Plus it would be very hypocritical of
>> us, since we ignore all specs when we deem them too limiting (whether
>> they be language specs, POSIX OS specs, or whatever).
> 
> Right, we have a lot of quirks for the apple controllers, what's one
> more? :)
> 
> Could the following patch be tried? I'm basing this off the 'lspci' from
> Orlando, but I'm assuming the previous model has the same limitation,
> too.
> 
> ---
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 7efb31b87f37..f0787233557f 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -979,6 +979,7 @@ EXPORT_SYMBOL_GPL(nvme_cleanup_cmd);
>   blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
>   {
>   	struct nvme_command *cmd = nvme_req(req)->cmd;
> +	struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
>   	blk_status_t ret = BLK_STS_OK;
>   
>   	if (!(req->rq_flags & RQF_DONTPREP)) {
> @@ -1027,7 +1028,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
>   		return BLK_STS_IOERR;
>   	}
>   
> -	nvme_req(req)->genctr++;
> +	if (!(ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
> +		nvme_req(req)->genctr++;
>   	cmd->common.command_id = nvme_cid(req);
>   	trace_nvme_setup_cmd(req, cmd);
>   	return ret;
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 9871c0c9374c..b49761d30df7 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -86,6 +86,12 @@ enum nvme_quirks {
>   	 */
>   	NVME_QUIRK_NO_DEEPEST_PS		= (1 << 5),
>   
> +	/*
> +	 * The controller requires the command_id value be be limited to the
> +	 * queue depth.
> +	 */
> +	NVME_QUIRK_SKIP_CID_GEN			= (1 << 6),
> +
>   	/*
>   	 * Set MEDIUM priority on SQ creation
>   	 */
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index b82492cd7503..d9f22ed68185 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -3369,7 +3369,10 @@ static const struct pci_device_id nvme_id_table[] = {
>   	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2005),
>   		.driver_data = NVME_QUIRK_SINGLE_VECTOR |
>   				NVME_QUIRK_128_BYTES_SQES |
> -				NVME_QUIRK_SHARED_TAGS },
> +				NVME_QUIRK_SHARED_TAGS ,
> +				NVME_QUIRK_SKIP_CID_GEN },
> +	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2006),
> +		.driver_data = NVME_QUIRK_SKIP_CID_GEN },
>   
>   	{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
>   	{ 0, }
> --
> 

Exactly the patch that I was going to propose. The only downside
is that we need to access the controller in the hot-path...

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Urgent bug fix causing Apple SSDs to not work.
  2021-09-25 20:08     ` Sven Peter
@ 2021-09-25 20:38       ` Keith Busch
  2021-09-25 20:48         ` Sven Peter
  0 siblings, 1 reply; 9+ messages in thread
From: Keith Busch @ 2021-09-25 20:38 UTC (permalink / raw)
  To: Sven Peter
  Cc: Linus Torvalds, Aditya Garg, axboe, hch, sagi, linux-nvme,
	james.smart, chaitanya.kulkarni, akpm, linux-kernel, trivial

On Sat, Sep 25, 2021 at 10:08:53PM +0200, Sven Peter wrote:
> I actually ran into a similar issue while adding support for the NVMe
> controller found on the M1 and assumed it was only present there.
> 
> Some background why this happens: ANS2 is a co-processor that emulates
> an NVMe MMIO interface and uses the tag as an index to an internal data
> structure. 

Thanks for confirming the behavior. The patch should restore the
command_id values to when everything was working. I'll just need to
update the quirk description to better align with the actual limitation
if the patch is successful. 

> On the M1 we can directly talk to ANS2 and while we can submit
> commands with a higher index it'll just ignore the upper bits and only
> return the lowest eight IIRC in the completion queue.
> I guess whatever software is running on the T2 actually has an assert to
> ensure that the tag is within the limits before forwarding the command
> to ANS2.

Is the PCI Device ID for the M1 the same as reported for the T2? Either
0x2005 or 0x2003 should make this quirk apply.

Side note, I mistakenly added an entry for device ID 0x2006, but that
was from me misreading the report.

> Best,
> 
> Sven
> 
> 
> On Sat, Sep 25, 2021, at 21:54, Keith Busch wrote:
> > On Sat, Sep 25, 2021 at 11:47:08AM -0700, Linus Torvalds wrote:
> >> On Fri, Sep 24, 2021 at 9:02 PM Aditya Garg <gargaditya08@live.com> wrote:
> >> >
> >> > From: Aditya Garg <gargaditya08@live.com>
> >> > Date: Fri, 24 Sep 2021 15:36:45 +0530
> >> > Subject: [PATCH] Revert nvme to 5.14.5 to fix incompatibility arised in Apple SSDs.
> >> > Fixes: e7006de6c238 (nvme: code command_id with a genctr for use-after-free validation)
> >> 
> >> I think we need to hear more about the problem than just revert a
> >> commit like this randomly. That commit has already been picked up for
> >> -stable,
> >> 
> >> What are the exact symptoms, and which Apple SSD is this?
> >> 
> >> I do find this:
> >> 
> >>   https://lore.kernel.org/all/cjJiSFV77WM51ciS8EuBcdeBcv9T83PUB-Kw3yi8PuC_LwrrUUnQ3w5RC1PbKvSYE72KryXp3wOJhv4Ov_WWIe2gKWOOo5uwuUjbbFA8HDM=@protonmail.com/
> >> 
> >> which instead of a revert has an actual patch. Can you try that one?
> >> 
> >> Keith Busch replied to that one, saying that the Apple SSD might not
> >> be spec compliant, but hey, what else is new? If we start demanding
> >> that hardware comply with specs, we'd have to scrap the whole notion
> >> of working in the real world. Plus it would be very hypocritical of
> >> us, since we ignore all specs when we deem them too limiting (whether
> >> they be language specs, POSIX OS specs, or whatever).
> >
> > Right, we have a lot of quirks for the apple controllers, what's one
> > more? :)
> >
> > Could the following patch be tried? I'm basing this off the 'lspci' from
> > Orlando, but I'm assuming the previous model has the same limitation,
> > too.
> >
> > ---
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index 7efb31b87f37..f0787233557f 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -979,6 +979,7 @@ EXPORT_SYMBOL_GPL(nvme_cleanup_cmd);
> >  blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
> >  {
> >  	struct nvme_command *cmd = nvme_req(req)->cmd;
> > +	struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
> >  	blk_status_t ret = BLK_STS_OK;
> > 
> >  	if (!(req->rq_flags & RQF_DONTPREP)) {
> > @@ -1027,7 +1028,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, 
> > struct request *req)
> >  		return BLK_STS_IOERR;
> >  	}
> > 
> > -	nvme_req(req)->genctr++;
> > +	if (!(ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
> > +		nvme_req(req)->genctr++;
> >  	cmd->common.command_id = nvme_cid(req);
> >  	trace_nvme_setup_cmd(req, cmd);
> >  	return ret;
> > diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> > index 9871c0c9374c..b49761d30df7 100644
> > --- a/drivers/nvme/host/nvme.h
> > +++ b/drivers/nvme/host/nvme.h
> > @@ -86,6 +86,12 @@ enum nvme_quirks {
> >  	 */
> >  	NVME_QUIRK_NO_DEEPEST_PS		= (1 << 5),
> > 
> > +	/*
> > +	 * The controller requires the command_id value be be limited to the
> > +	 * queue depth.
> > +	 */
> > +	NVME_QUIRK_SKIP_CID_GEN			= (1 << 6),
> > +
> >  	/*
> >  	 * Set MEDIUM priority on SQ creation
> >  	 */
> > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > index b82492cd7503..d9f22ed68185 100644
> > --- a/drivers/nvme/host/pci.c
> > +++ b/drivers/nvme/host/pci.c
> > @@ -3369,7 +3369,10 @@ static const struct pci_device_id nvme_id_table[] = {
> >  	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2005),
> >  		.driver_data = NVME_QUIRK_SINGLE_VECTOR |
> >  				NVME_QUIRK_128_BYTES_SQES |
> > -				NVME_QUIRK_SHARED_TAGS },
> > +				NVME_QUIRK_SHARED_TAGS ,
> > +				NVME_QUIRK_SKIP_CID_GEN },
> > +	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2006),
> > +		.driver_data = NVME_QUIRK_SKIP_CID_GEN },
> > 
> >  	{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
> >  	{ 0, }
> > --
> 
> -- 
> Sven Peter

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Urgent bug fix causing Apple SSDs to not work.
  2021-09-25 20:38       ` Keith Busch
@ 2021-09-25 20:48         ` Sven Peter
  0 siblings, 0 replies; 9+ messages in thread
From: Sven Peter @ 2021-09-25 20:48 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linus Torvalds, Aditya Garg, axboe, hch, sagi, linux-nvme,
	james.smart, chaitanya.kulkarni, akpm, linux-kernel, trivial



On Sat, Sep 25, 2021, at 22:38, Keith Busch wrote:
> On Sat, Sep 25, 2021 at 10:08:53PM +0200, Sven Peter wrote:
>> I actually ran into a similar issue while adding support for the NVMe
>> controller found on the M1 and assumed it was only present there.
>> 
>> Some background why this happens: ANS2 is a co-processor that emulates
>> an NVMe MMIO interface and uses the tag as an index to an internal data
>> structure. 
>
> Thanks for confirming the behavior. The patch should restore the
> command_id values to when everything was working. I'll just need to
> update the quirk description to better align with the actual limitation
> if the patch is successful. 
>
>> On the M1 we can directly talk to ANS2 and while we can submit
>> commands with a higher index it'll just ignore the upper bits and only
>> return the lowest eight IIRC in the completion queue.
>> I guess whatever software is running on the T2 actually has an assert to
>> ensure that the tag is within the limits before forwarding the command
>> to ANS2.
>
> Is the PCI Device ID for the M1 the same as reported for the T2? Either
> 0x2005 or 0x2003 should make this quirk apply.

Oh, it's worse. It's no longer a PCIe device there but a platform device
directly attached to the SoC's bus. It'll need almost all T2 quirks and then
some additional changes to the way command submission works and support to
program a tightly coupled IOMMU which Apple calls NVMMU (which probably is another
reason for this limit: they use the command id to find the corresponding entry in
the IOMMU that allows access to memory listed in the PRPs).
I'll just use the quirk in my platform driver and submit it once I get it into a
shape that can actually be reviewed.



Sven


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Urgent bug fix causing Apple SSDs to not work.
  2021-09-25 20:34     ` Sagi Grimberg
@ 2021-09-25 20:48       ` Keith Busch
       [not found]       ` <PNZPR01MB4415F5304CAA1A83442549DBB8A69@PNZPR01MB4415.INDPRD01.PROD.OUTLOOK.COM>
  1 sibling, 0 replies; 9+ messages in thread
From: Keith Busch @ 2021-09-25 20:48 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Linus Torvalds, Aditya Garg, axboe, hch, linux-nvme, james.smart,
	chaitanya.kulkarni, akpm, linux-kernel, trivial

On Sat, Sep 25, 2021 at 11:34:31PM +0300, Sagi Grimberg wrote:
> Exactly the patch that I was going to propose. The only downside
> is that we need to access the controller in the hot-path...

I initially had the same concern but noticed we already access 'ctrl' in
nvme_setup_rw() in this path, so I don't think we will be able to
measure the cost of the added check.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Urgent bug fix causing Apple SSDs to not work.
       [not found]         ` <PNZPR01MB4415986CC44AACC100AD1189B8A69@PNZPR01MB4415.INDPRD01.PROD.OUTLOOK.COM>
@ 2021-09-26  8:44           ` Orlando Chamberlain
  2021-09-27  4:35             ` Aditya Garg
  0 siblings, 1 reply; 9+ messages in thread
From: Orlando Chamberlain @ 2021-09-26  8:44 UTC (permalink / raw)
  To: Aditya Garg, sagi, kbusch
  Cc: torvalds, axboe, hch, linux-nvme, james.smart,
	chaitanya.kulkarni, akpm, linux-kernel, trivial

On 26/9/21 18:07, Aditya Garg wrote:
> I checked out the proposal sent by Orlando Chamberlain to replace 
> NVME_QUIRK_SHARED_TAGS , by NVME_QUIRK_SHARED_TAGS | given in the patch on 
> http://lists.infradead.org/pipermail/linux-nvme/2021-September/027665.html 
> <http://lists.infradead.org/pipermail/linux-nvme/2021-September/027665.html>. 
> The , still causes panics to the T2 as described before. In the case of |, the 
> kernel boots correctly without panicking the T2, but in case we are having Linux 
> on an External Drive, which is my case, then the internal SSD doesn't seem to be 
> recognised at all. I've tested the patch on 5.14.7. A screenshot of the GNOME 
> Disks utility has been attached along with this mail.

I had some issues with the patch not applying automatically as I mixed up tabs and spaces,
(due to me copying stuff directly out of vim), so I had issues with getting patches to apply
automatically. I've checked that this file https://github.com/Redecorating/mbp-16.1-linux-wifi/blob/anbox-bt/1001-apple-t2-disable-nvme-gen-counter-quirk.patch
applies correctly for me (and github ci), so can you see if it still doesn't work with that?


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Urgent bug fix causing Apple SSDs to not work.
  2021-09-26  8:44           ` Orlando Chamberlain
@ 2021-09-27  4:35             ` Aditya Garg
  0 siblings, 0 replies; 9+ messages in thread
From: Aditya Garg @ 2021-09-27  4:35 UTC (permalink / raw)
  To: linux-nvme, linux-kernel

I tried Orlando Chamberlain's patch on 5.14.7 but still the SSD is not being recognised.

________________________________________
From: Orlando Chamberlain <redecorating@protonmail.com>
Sent: Sunday, September 26, 2021 8:44 AM
To: Aditya Garg; sagi@grimberg.me; kbusch@kernel.org
Cc: torvalds@linux-foundation.org; axboe@fb.com; hch@lst.de; linux-nvme@lists.infradead.org; james.smart@broadcom.com; chaitanya.kulkarni@wdc.com; akpm@linux-foundation.org; linux-kernel@vger.kernel.org; trivial@kernel.org
Subject: Re: [PATCH] Urgent bug fix causing Apple SSDs to not work.

On 26/9/21 18:07, Aditya Garg wrote:
> I checked out the proposal sent by Orlando Chamberlain to replace
> NVME_QUIRK_SHARED_TAGS , by NVME_QUIRK_SHARED_TAGS | given in the patch on
> http://lists.infradead.org/pipermail/linux-nvme/2021-September/027665.html
> <http://lists.infradead.org/pipermail/linux-nvme/2021-September/027665.html>.
> The , still causes panics to the T2 as described before. In the case of |, the
> kernel boots correctly without panicking the T2, but in case we are having Linux
> on an External Drive, which is my case, then the internal SSD doesn't seem to be
> recognised at all. I've tested the patch on 5.14.7. A screenshot of the GNOME
> Disks utility has been attached along with this mail.

I had some issues with the patch not applying automatically as I mixed up tabs and spaces,
(due to me copying stuff directly out of vim), so I had issues with getting patches to apply
automatically. I've checked that this file https://github.com/Redecorating/mbp-16.1-linux-wifi/blob/anbox-bt/1001-apple-t2-disable-nvme-gen-counter-quirk.patch
applies correctly for me (and github ci), so can you see if it still doesn't work with that?


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-09-27  4:35 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <PNZPR01MB4415600ACD3C8D9944F15058B8A59@PNZPR01MB4415.INDPRD01.PROD.OUTLOOK.COM>
2021-09-25 18:47 ` [PATCH] Urgent bug fix causing Apple SSDs to not work Linus Torvalds
2021-09-25 19:54   ` Keith Busch
2021-09-25 20:08     ` Sven Peter
2021-09-25 20:38       ` Keith Busch
2021-09-25 20:48         ` Sven Peter
2021-09-25 20:34     ` Sagi Grimberg
2021-09-25 20:48       ` Keith Busch
     [not found]       ` <PNZPR01MB4415F5304CAA1A83442549DBB8A69@PNZPR01MB4415.INDPRD01.PROD.OUTLOOK.COM>
     [not found]         ` <PNZPR01MB4415986CC44AACC100AD1189B8A69@PNZPR01MB4415.INDPRD01.PROD.OUTLOOK.COM>
2021-09-26  8:44           ` Orlando Chamberlain
2021-09-27  4:35             ` Aditya Garg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).