All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] nvme: check for valid data from from nvme_identify_ns() before using it
@ 2023-11-27 20:56 Ewan D. Milne
  2023-11-27 22:03 ` Keith Busch
  0 siblings, 1 reply; 5+ messages in thread
From: Ewan D. Milne @ 2023-11-27 20:56 UTC (permalink / raw)
  To: linux-nvme; +Cc: kbusch

When scanning namespaces, it is possible to get valid data from the first
call to nvme_identify_ns() in nvme_alloc_ns(), but not from the second
call in nvme_update_ns_info_block().  In particular, if the NSID becomes
inactive between the two commands, a storage device may return a buffer
filled with zero as per 4.1.5.1.  In this case, we can get a kernel crash
due to a divide-by-zero in blk_stack_limits() because ns->lba_shift will
be set to zero.

PID: 326      TASK: ffff95fec3cd8000  CPU: 29   COMMAND: "kworker/u98:10"
 #0 [ffffad8f8702f9e0] machine_kexec at ffffffff91c76ec7
 #1 [ffffad8f8702fa38] __crash_kexec at ffffffff91dea4fa
 #2 [ffffad8f8702faf8] crash_kexec at ffffffff91deb788
 #3 [ffffad8f8702fb00] oops_end at ffffffff91c2e4bb
 #4 [ffffad8f8702fb20] do_trap at ffffffff91c2a4ce
 #5 [ffffad8f8702fb70] do_error_trap at ffffffff91c2a595
 #6 [ffffad8f8702fbb0] exc_divide_error at ffffffff928506e6
 #7 [ffffad8f8702fbd0] asm_exc_divide_error at ffffffff92a00926
    [exception RIP: blk_stack_limits+434]
    RIP: ffffffff92191872  RSP: ffffad8f8702fc80  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffff95efa0c91800  RCX: 0000000000000001
    RDX: 0000000000000000  RSI: 0000000000000001  RDI: 0000000000000001
    RBP: 00000000ffffffff   R8: ffff95fec7df35a8   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000001  R12: 0000000000000000
    R13: 0000000000000000  R14: 0000000000000000  R15: ffff95fed33c09a8
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffffad8f8702fce0] nvme_update_ns_info_block at ffffffffc06d3533 [nvme_core]
 #9 [ffffad8f8702fd18] nvme_scan_ns at ffffffffc06d6fa7 [nvme_core]

This happened when the check for valid data was moved out of nvme_identify_ns()
into one of the callers.  Fix this by checking in both callers.

v2: call kfree() on nvme_id_ns struct in error path

Fixes: 0dd6fff2aad4 ("nvme: bring back auto-removal of deleted namespaces during sequential scan")
Cc: stable@vger.kernel.org
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
---
 drivers/nvme/host/core.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index af94d067ac14..a55c2a774b9c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2040,6 +2040,13 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 	if (ret)
 		return ret;
 
+	if (id->ncap == 0) {
+		/* namespace not allocated or attached */
+		info->is_removed = true;
+		ret = -ENODEV;
+		goto error;
+	}
+
 	blk_mq_freeze_queue(ns->disk->queue);
 	lbaf = nvme_lbaf_index(id->flbas);
 	ns->lba_shift = id->lbaf[lbaf].ds;
@@ -2101,6 +2108,8 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 		set_bit(NVME_NS_READY, &ns->flags);
 		ret = 0;
 	}
+
+error:
 	kfree(id);
 	return ret;
 }
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] nvme: check for valid data from from nvme_identify_ns() before using it
  2023-11-27 20:56 [PATCH v2] nvme: check for valid data from from nvme_identify_ns() before using it Ewan D. Milne
@ 2023-11-27 22:03 ` Keith Busch
  2023-11-28 16:10   ` Ewan Milne
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2023-11-27 22:03 UTC (permalink / raw)
  To: Ewan D. Milne; +Cc: linux-nvme

On Mon, Nov 27, 2023 at 03:56:57PM -0500, Ewan D. Milne wrote:
> When scanning namespaces, it is possible to get valid data from the first
> call to nvme_identify_ns() in nvme_alloc_ns(), but not from the second
> call in nvme_update_ns_info_block().  In particular, if the NSID becomes
> inactive between the two commands, a storage device may return a buffer
> filled with zero as per 4.1.5.1.  In this case, we can get a kernel crash
> due to a divide-by-zero in blk_stack_limits() because ns->lba_shift will
> be set to zero.
> 
> PID: 326      TASK: ffff95fec3cd8000  CPU: 29   COMMAND: "kworker/u98:10"
>  #0 [ffffad8f8702f9e0] machine_kexec at ffffffff91c76ec7
>  #1 [ffffad8f8702fa38] __crash_kexec at ffffffff91dea4fa
>  #2 [ffffad8f8702faf8] crash_kexec at ffffffff91deb788
>  #3 [ffffad8f8702fb00] oops_end at ffffffff91c2e4bb
>  #4 [ffffad8f8702fb20] do_trap at ffffffff91c2a4ce
>  #5 [ffffad8f8702fb70] do_error_trap at ffffffff91c2a595
>  #6 [ffffad8f8702fbb0] exc_divide_error at ffffffff928506e6
>  #7 [ffffad8f8702fbd0] asm_exc_divide_error at ffffffff92a00926
>     [exception RIP: blk_stack_limits+434]
>     RIP: ffffffff92191872  RSP: ffffad8f8702fc80  RFLAGS: 00010246
>     RAX: 0000000000000000  RBX: ffff95efa0c91800  RCX: 0000000000000001
>     RDX: 0000000000000000  RSI: 0000000000000001  RDI: 0000000000000001
>     RBP: 00000000ffffffff   R8: ffff95fec7df35a8   R9: 0000000000000000
>     R10: 0000000000000000  R11: 0000000000000001  R12: 0000000000000000
>     R13: 0000000000000000  R14: 0000000000000000  R15: ffff95fed33c09a8
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>  #8 [ffffad8f8702fce0] nvme_update_ns_info_block at ffffffffc06d3533 [nvme_core]
>  #9 [ffffad8f8702fd18] nvme_scan_ns at ffffffffc06d6fa7 [nvme_core]
> 
> This happened when the check for valid data was moved out of nvme_identify_ns()
> into one of the callers.  Fix this by checking in both callers.
> 
> v2: call kfree() on nvme_id_ns struct in error path
> 
> Fixes: 0dd6fff2aad4 ("nvme: bring back auto-removal of deleted namespaces during sequential scan")
> Cc: stable@vger.kernel.org
> Signed-off-by: Ewan D. Milne <emilne@redhat.com>

Thanks, applied for nvme-6.7.

Interestingly enough, I think this is the same as what was recently
reported here:

  https://bugzilla.kernel.org/show_bug.cgi?id=218186


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] nvme: check for valid data from from nvme_identify_ns() before using it
  2023-11-27 22:03 ` Keith Busch
@ 2023-11-28 16:10   ` Ewan Milne
  2023-11-28 17:32     ` Keith Busch
  0 siblings, 1 reply; 5+ messages in thread
From: Ewan Milne @ 2023-11-28 16:10 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-nvme

On Mon, Nov 27, 2023 at 5:03 PM Keith Busch <kbusch@kernel.org> wrote:
>
> On Mon, Nov 27, 2023 at 03:56:57PM -0500, Ewan D. Milne wrote:
> > When scanning namespaces, it is possible to get valid data from the first
> > call to nvme_identify_ns() in nvme_alloc_ns(), but not from the second
> > call in nvme_update_ns_info_block().  In particular, if the NSID becomes
> > inactive between the two commands, a storage device may return a buffer
> > filled with zero as per 4.1.5.1.  In this case, we can get a kernel crash
> > due to a divide-by-zero in blk_stack_limits() because ns->lba_shift will
> > be set to zero.
> >
> > PID: 326      TASK: ffff95fec3cd8000  CPU: 29   COMMAND: "kworker/u98:10"
> >  #0 [ffffad8f8702f9e0] machine_kexec at ffffffff91c76ec7
> >  #1 [ffffad8f8702fa38] __crash_kexec at ffffffff91dea4fa
> >  #2 [ffffad8f8702faf8] crash_kexec at ffffffff91deb788
> >  #3 [ffffad8f8702fb00] oops_end at ffffffff91c2e4bb
> >  #4 [ffffad8f8702fb20] do_trap at ffffffff91c2a4ce
> >  #5 [ffffad8f8702fb70] do_error_trap at ffffffff91c2a595
> >  #6 [ffffad8f8702fbb0] exc_divide_error at ffffffff928506e6
> >  #7 [ffffad8f8702fbd0] asm_exc_divide_error at ffffffff92a00926
> >     [exception RIP: blk_stack_limits+434]
> >     RIP: ffffffff92191872  RSP: ffffad8f8702fc80  RFLAGS: 00010246
> >     RAX: 0000000000000000  RBX: ffff95efa0c91800  RCX: 0000000000000001
> >     RDX: 0000000000000000  RSI: 0000000000000001  RDI: 0000000000000001
> >     RBP: 00000000ffffffff   R8: ffff95fec7df35a8   R9: 0000000000000000
> >     R10: 0000000000000000  R11: 0000000000000001  R12: 0000000000000000
> >     R13: 0000000000000000  R14: 0000000000000000  R15: ffff95fed33c09a8
> >     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> >  #8 [ffffad8f8702fce0] nvme_update_ns_info_block at ffffffffc06d3533 [nvme_core]
> >  #9 [ffffad8f8702fd18] nvme_scan_ns at ffffffffc06d6fa7 [nvme_core]
> >
> > This happened when the check for valid data was moved out of nvme_identify_ns()
> > into one of the callers.  Fix this by checking in both callers.
> >
> > v2: call kfree() on nvme_id_ns struct in error path
> >
> > Fixes: 0dd6fff2aad4 ("nvme: bring back auto-removal of deleted namespaces during sequential scan")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Ewan D. Milne <emilne@redhat.com>
>
> Thanks, applied for nvme-6.7.
>
> Interestingly enough, I think this is the same as what was recently
> reported here:
>
>   https://bugzilla.kernel.org/show_bug.cgi?id=218186
>

Yes, from the stack trace and BZ comments it looks like it.

-Ewan



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] nvme: check for valid data from from nvme_identify_ns() before using it
  2023-11-28 16:10   ` Ewan Milne
@ 2023-11-28 17:32     ` Keith Busch
  2023-11-28 22:57       ` Ewan Milne
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2023-11-28 17:32 UTC (permalink / raw)
  To: Ewan Milne; +Cc: linux-nvme

On Tue, Nov 28, 2023 at 11:10:29AM -0500, Ewan Milne wrote:
> > Interestingly enough, I think this is the same as what was recently
> > reported here:
> >
> >   https://bugzilla.kernel.org/show_bug.cgi?id=218186
> >
> 
> Yes, from the stack trace and BZ comments it looks like it.

Slightly related, there's nothing stopping a device from reporting a
bogus LBA shift value. We're only checking for "too big", but neglect to
check for too small, and may similiarly shift off to a division by 0. We
might need a check like this too:

---
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index a55c2a774b9c4..8bf78f876f7ac 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1903,7 +1903,7 @@ static void nvme_update_disk_info(struct gendisk *disk,
 	 * The block layer can't support LBA sizes larger than the page size
 	 * yet, so catch this early and don't allow block I/O.
 	 */
-	if (ns->lba_shift > PAGE_SHIFT) {
+	if (ns->lba_shift > PAGE_SHIFT || ns->lba_shift < SECTOR_SHIFT) {
 		capacity = 0;
 		bs = (1 << 9);
 	}
--


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] nvme: check for valid data from from nvme_identify_ns() before using it
  2023-11-28 17:32     ` Keith Busch
@ 2023-11-28 22:57       ` Ewan Milne
  0 siblings, 0 replies; 5+ messages in thread
From: Ewan Milne @ 2023-11-28 22:57 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-nvme

Sure, the kernel should not crash regardless of what any device reports,
especially an external one.

The _identify_ns() divide by zero happened because the NVMe spec
says a page of zeroes can be returned if a namespace isn't configured.
I'm not really sure why the spec didn't require a nonzero status code.

-Ewan

On Tue, Nov 28, 2023 at 12:32 PM Keith Busch <kbusch@kernel.org> wrote:
>
> On Tue, Nov 28, 2023 at 11:10:29AM -0500, Ewan Milne wrote:
> > > Interestingly enough, I think this is the same as what was recently
> > > reported here:
> > >
> > >   https://bugzilla.kernel.org/show_bug.cgi?id=218186
> > >
> >
> > Yes, from the stack trace and BZ comments it looks like it.
>
> Slightly related, there's nothing stopping a device from reporting a
> bogus LBA shift value. We're only checking for "too big", but neglect to
> check for too small, and may similiarly shift off to a division by 0. We
> might need a check like this too:
>
> ---
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index a55c2a774b9c4..8bf78f876f7ac 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1903,7 +1903,7 @@ static void nvme_update_disk_info(struct gendisk *disk,
>          * The block layer can't support LBA sizes larger than the page size
>          * yet, so catch this early and don't allow block I/O.
>          */
> -       if (ns->lba_shift > PAGE_SHIFT) {
> +       if (ns->lba_shift > PAGE_SHIFT || ns->lba_shift < SECTOR_SHIFT) {
>                 capacity = 0;
>                 bs = (1 << 9);
>         }
> --
>



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-11-28 22:57 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-27 20:56 [PATCH v2] nvme: check for valid data from from nvme_identify_ns() before using it Ewan D. Milne
2023-11-27 22:03 ` Keith Busch
2023-11-28 16:10   ` Ewan Milne
2023-11-28 17:32     ` Keith Busch
2023-11-28 22:57       ` Ewan Milne

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.