linux-kernel.vger.kernel.org archive mirror
* Re: [REGRESSION] nvme: code command_id with a genctr for use-after-free validation crashes apple T2 SSD
From: Aditya Garg @ 2021-09-27  4:51 UTC (permalink / raw)
  To: Orlando Chamberlain, sagi, kbusch
  Cc: linux-nvme, regressions, hare, dwagner, hch, torvalds, axboe,
	james.smart, chaitanya.kulkarni, akpm, linux-kernel

I am getting the same error.

________________________________________
From: Orlando Chamberlain <redecorating@protonmail.com>
Sent: Monday, September 27, 2021 4:22 AM
To: Sagi Grimberg; Aditya Garg; kbusch@kernel.org
Cc: linux-nvme@lists.infradead.org; regressions@lists.linux.dev; hare@suse.de; dwagner@suse.de; hch@lst.de
Subject: Re: [REGRESSION] nvme: code command_id with a genctr for use-after-free validation crashes apple T2 SSD

On 26/9/21 18:44, Sagi Grimberg wrote:
>
>> I tested Orlando Chamberlain's proposal to replace
>> "NVME_QUIRK_SHARED_TAGS ," with "NVME_QUIRK_SHARED_TAGS |" in the patch at
>> http://lists.infradead.org/pipermail/linux-nvme/2021-September/027665.html.
>> With "," the T2 still panics as described before. With "|" the kernel
>> boots without panicking the T2, but when Linux runs from an external
>> drive, as in my case, the internal SSD is not recognised at all. I
>> tested the patch on 5.14.7.
>
> That sounds like a separate issue, because with this patch applied,
> all tags should be within the queue entry range (with generation
> set to 0 always).
>
> Is it possible that the io_queue_depth is being set to something
> that exceeds NVME_PCI_MAX_QUEUE_SIZE (4095)? The default is 1024.
>
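For context, the point about tags staying within the queue entry range can
be modelled with a minimal sketch. It assumes the command_id is a 16-bit
value with a 4-bit generation counter above a 12-bit tag, which is consistent
with the 4095 limit quoted above; the helper names (cid_encode, cid_tag,
cid_genctr, cid_within_queue) are hypothetical and this is not the driver's
code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 4-bit generation counter above a 12-bit tag. */
#define CID_TAG_BITS 12
#define CID_TAG_MASK 0x0fffu
#define CID_GEN_MASK 0x000fu

/* Pack a generation counter and a block-layer tag into one command_id. */
static uint16_t cid_encode(uint8_t genctr, uint16_t tag)
{
	assert(tag <= CID_TAG_MASK); /* tag must fit below the generation bits */
	return (uint16_t)(((genctr & CID_GEN_MASK) << CID_TAG_BITS) | tag);
}

/* Recover the tag (low 12 bits) from a command_id. */
static uint16_t cid_tag(uint16_t cid)
{
	return cid & CID_TAG_MASK;
}

/* Recover the generation counter (high 4 bits) from a command_id. */
static uint8_t cid_genctr(uint16_t cid)
{
	return (uint8_t)(cid >> CID_TAG_BITS);
}

/* Model of a device that only accepts command_ids below the queue depth. */
static int cid_within_queue(uint16_t cid, uint16_t queue_depth)
{
	return cid < queue_depth;
}
```

Under these assumptions, a generation of 0 leaves the command_id equal to the
tag, so it stays below the queue depth; any non-zero generation pushes it to
4096 or above, which such a device would see as out of range.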
I've been able to reproduce it by using the same kernel Aditya is using:
https://github.com/AdityaGarg8/T2-Big-Sur-Ubuntu-Kernel/actions/runs/1275383460

From the initramfs:

# dmesg | grep nvme
nvme nvme0: pci function 0000:04:00.0
nvme nvme0: 1/0/0 default/read/poll queues
nvme nvme0: Identify NS List failed (status=0xb)
nvme nvme0: LightNVM init failure

It might be because this is 5.14.7, while I've been using 5.15-rc2. There are
also differences in the kernel configs; I've put both configs in this gist:
https://gist.github.com/Redecorating/c8cf574df969f9b4f626dfb9c6b2a758



* Re: [REGRESSION] nvme: code command_id with a genctr for use-after-free validation crashes apple T2 SSD
From: Sven Peter @ 2021-09-27  6:05 UTC (permalink / raw)
  To: Aditya Garg, Orlando Chamberlain, sagi, Keith Busch
  Cc: linux-nvme, regressions, hare, dwagner, hch, Linus Torvalds,
	axboe, james.smart, chaitanya.kulkarni, akpm, linux-kernel

Hi,


On Mon, Sep 27, 2021, at 06:51, Aditya Garg wrote:
> I am getting the same error.
>
> ________________________________________
> From: Orlando Chamberlain <redecorating@protonmail.com>
> Sent: Monday, September 27, 2021 4:22 AM
> To: Sagi Grimberg; Aditya Garg; kbusch@kernel.org
> Cc: linux-nvme@lists.infradead.org; regressions@lists.linux.dev; 
> hare@suse.de; dwagner@suse.de; hch@lst.de
> Subject: Re: [REGRESSION] nvme: code command_id with a genctr for 
> use-after-free validation crashes apple T2 SSD
>
> On 26/9/21 18:44, Sagi Grimberg wrote:
>>
>>> I tested Orlando Chamberlain's proposal to replace
>>> "NVME_QUIRK_SHARED_TAGS ," with "NVME_QUIRK_SHARED_TAGS |" in the patch at
>>> http://lists.infradead.org/pipermail/linux-nvme/2021-September/027665.html.
>>> With "," the T2 still panics as described before. With "|" the kernel
>>> boots without panicking the T2, but when Linux runs from an external
>>> drive, as in my case, the internal SSD is not recognised at all. I
>>> tested the patch on 5.14.7.
>>
>> That sounds like a separate issue, because with this patch applied,
>> all tags should be within the queue entry range (with generation
>> set to 0 always).
>>
>> Is it possible that the io_queue_depth is being set to something
>> that exceeds NVME_PCI_MAX_QUEUE_SIZE (4095)? The default is 1024.
>>
> I've been able to reproduce it by using the same kernel Aditya is using:
> https://github.com/AdityaGarg8/T2-Big-Sur-Ubuntu-Kernel/actions/runs/1275383460
>
> From the initramfs:
>
> # dmesg | grep nvme
> nvme nvme0: pci function 0000:04:00.0
> nvme nvme0: 1/0/0 default/read/poll queues
> nvme nvme0: Identify NS List failed (status=0xb)
> nvme nvme0: LightNVM init failure

Maybe I should've just submitted the quirks required for the M1 already...
The ANS2 firmware on there doesn't support the vanilla nvme_scan_ns_list
call. That should not break anything though because core.c falls back to
nvme_scan_ns_sequential in that case which does work. It just results
in that "Identify NS List failed (status=0xb)" error message.
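The fallback described above can be sketched as a small userspace model. It
is only an illustration of the control flow: struct model_ctrl and the
model_* functions are hypothetical stand-ins and not the code in core.c.

```c
#include <stdio.h>

enum scan_path { SCAN_VIA_LIST, SCAN_VIA_SEQUENTIAL };

/* Hypothetical controller model: only the fields this sketch needs. */
struct model_ctrl {
	int supports_ns_list; /* 0 models firmware that rejects the list command */
	unsigned int nn;      /* number of namespace IDs to probe sequentially */
};

/* Stand-in for the list-based scan: fails on firmware without support. */
static int model_scan_ns_list(const struct model_ctrl *ctrl)
{
	if (!ctrl->supports_ns_list) {
		fprintf(stderr, "Identify NS List failed (status=0x%x)\n", 0xbu);
		return -1;
	}
	return 0;
}

/* Stand-in for the sequential scan: walks namespace IDs 1..nn. */
static unsigned int model_scan_ns_sequential(const struct model_ctrl *ctrl)
{
	unsigned int nsid, found = 0;

	for (nsid = 1; nsid <= ctrl->nn; nsid++)
		found++; /* a real driver would identify each nsid here */
	return found;
}

/* Try the list scan first; fall back to the sequential scan on failure. */
static enum scan_path model_scan(const struct model_ctrl *ctrl)
{
	if (model_scan_ns_list(ctrl) != 0) {
		model_scan_ns_sequential(ctrl);
		return SCAN_VIA_SEQUENTIAL;
	}
	return SCAN_VIA_LIST;
}
```

In this model the error message is printed once and the scan still completes
through the sequential path, which matches the behaviour described above.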

Not sure where that LightNVM failure comes from though. Afaict lightnvm has been 
removed from 5.15 and shouldn't be used for the Apple controller anyway.

Sven


* Re: [REGRESSION] nvme: code command_id with a genctr for use-after-free validation crashes apple T2 SSD
From: Keith Busch @ 2021-09-27 15:02 UTC (permalink / raw)
  To: Sven Peter
  Cc: Aditya Garg, Orlando Chamberlain, sagi, linux-nvme, regressions,
	hare, dwagner, hch, Linus Torvalds, axboe, james.smart,
	chaitanya.kulkarni, akpm, linux-kernel

On Mon, Sep 27, 2021 at 08:05:00AM +0200, Sven Peter wrote:
> Hi,
> 
> 
> On Mon, Sep 27, 2021, at 06:51, Aditya Garg wrote:
> > I am getting the same error.
> >
> > ________________________________________
> > From: Orlando Chamberlain <redecorating@protonmail.com>
> > Sent: Monday, September 27, 2021 4:22 AM
> > To: Sagi Grimberg; Aditya Garg; kbusch@kernel.org
> > Cc: linux-nvme@lists.infradead.org; regressions@lists.linux.dev; 
> > hare@suse.de; dwagner@suse.de; hch@lst.de
> > Subject: Re: [REGRESSION] nvme: code command_id with a genctr for 
> > use-after-free validation crashes apple T2 SSD
> >
> > On 26/9/21 18:44, Sagi Grimberg wrote:
> >>
> >>> I tested Orlando Chamberlain's proposal to replace
> >>> "NVME_QUIRK_SHARED_TAGS ," with "NVME_QUIRK_SHARED_TAGS |" in the patch at
> >>> http://lists.infradead.org/pipermail/linux-nvme/2021-September/027665.html.
> >>> With "," the T2 still panics as described before. With "|" the kernel
> >>> boots without panicking the T2, but when Linux runs from an external
> >>> drive, as in my case, the internal SSD is not recognised at all. I
> >>> tested the patch on 5.14.7.
> >>
> >> That sounds like a separate issue, because with this patch applied,
> >> all tags should be within the queue entry range (with generation
> >> set to 0 always).
> >>
> >> Is it possible that the io_queue_depth is being set to something
> >> that exceeds NVME_PCI_MAX_QUEUE_SIZE (4095)? The default is 1024.
> >>
> > I've been able to reproduce it by using the same kernel Aditya is using:
> > https://github.com/AdityaGarg8/T2-Big-Sur-Ubuntu-Kernel/actions/runs/1275383460
> >
> > From the initramfs:
> >
> > # dmesg | grep nvme
> > nvme nvme0: pci function 0000:04:00.0
> > nvme nvme0: 1/0/0 default/read/poll queues
> > nvme nvme0: Identify NS List failed (status=0xb)
> > nvme nvme0: LightNVM init failure
> 
> Maybe I should've just submitted the quirks required for the M1 already...
> The ANS2 firmware on there doesn't support the vanilla nvme_scan_ns_list
> call. That should not break anything though because core.c falls back to
> nvme_scan_ns_sequential in that case which does work. It just results
> in that "Identify NS List failed (status=0xb)" error message.
> 
> Not sure where that LightNVM failure comes from though. Afaict lightnvm has been 
> removed from 5.15 and shouldn't be used for the Apple controller anyway.

Yes, lightnvm was removed in 5.15, so the quirk bit identifying these
devices was removed. The patch was made for upstream 5.15. If you want
to backport it to stable, the bit will need to be changed; otherwise it
will make the driver think it's an open-channel SSD instead and fail.
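
A minimal sketch of that bit collision, for illustration only. The macro
names and bit positions below are hypothetical and are not taken from any
kernel tree; the point is just that a flag value reused in a newer tree keeps
its old meaning when read by older code.

```c
#include <stdbool.h>

/* Illustrative bit positions only; not taken from any kernel tree. */
#define OLD_QUIRK_OPENCHANNEL      (1UL << 6)  /* meaning in the older tree */
#define NEW_QUIRK_SKIP_GENCTR      (1UL << 6)  /* same bit, reused after removal */
#define BACKPORT_QUIRK_SKIP_GENCTR (1UL << 20) /* unused bit picked for a backport */

/* How the older tree interprets a quirk mask: it still tests the old bit. */
static bool old_tree_sees_openchannel(unsigned long quirks)
{
	return (quirks & OLD_QUIRK_OPENCHANNEL) != 0;
}
```

Passing NEW_QUIRK_SKIP_GENCTR to old_tree_sees_openchannel() returns true,
which models the misdetection; BACKPORT_QUIRK_SKIP_GENCTR does not trigger
it because it does not overlap the old bit.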

