linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()
@ 2019-01-08 20:50 Logan Gunthorpe
  2019-01-08 21:25 ` Jeff Moyer
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Logan Gunthorpe @ 2019-01-08 20:50 UTC (permalink / raw)
  To: linux-scsi, linux-block, linux-kernel
  Cc: Logan Gunthorpe, Intel SCU Linux support, Artur Paszkiewicz,
	James E.J. Bottomley, Martin K. Petersen, Christoph Hellwig,
	Jens Axboe, Jeff Moyer

scsi_mq_setup_tags(), which is called by scsi_add_host(), calculates
the command size to allocate based on the prot_capabilities. In the
isci driver, scsi_host_set_prot() is called after scsi_add_host()
so the command size gets calculated to be smaller than it needs to be.
Eventually, scsi_mq_init_request() locates the 'prot_sdb' after the command
assuming it was sized correctly and a buffer overrun may occur.

However, seeing blk_mq_alloc_rqs() rounds up to the nearest cache line
size, the mistake can go unnoticed.

The bug was noticed after the struct request size was reduced by
commit 9d037ad707ed ("block: remove req->timeout_list")

Which likely reduced the allocated space for the request by an entire
cache line, enough that the overflow could be hit and it caused a panic,
on boot, at:

  RIP: 0010:t10_pi_complete+0x77/0x1c0
  Call Trace:
    <IRQ>
    sd_done+0xf5/0x340
    scsi_finish_command+0xc3/0x120
    blk_done_softirq+0x83/0xb0
    __do_softirq+0xa1/0x2e6
    irq_exit+0xbc/0xd0
    call_function_single_interrupt+0xf/0x20
    </IRQ>

sd_done() would call scsi_prot_sg_count() which reads the number of
entities in 'prot_sdb', but seeing 'prot_sdb' is located after the end of
the allocated space it reads a garbage number and erroneously calls
t10_pi_complete().

To prevent this, the calls to scsi_host_set_prot() are moved into
isci_host_alloc() before the call to scsi_add_host(). Out of caution,
also move the similar call to scsi_host_set_guard().

Fixes: 3d2d75254915 ("[SCSI] isci: T10 DIF support")
Link: http://lkml.kernel.org/r/da851333-eadd-163a-8c78-e1f4ec5ec857@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Intel SCU Linux support <intel-linux-scu@intel.com>
Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jeff Moyer <jmoyer@redhat.com>
---
 drivers/scsi/isci/init.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/isci/init.c b/drivers/scsi/isci/init.c
index 68b90c4f79a3..1727d0c71b12 100644
--- a/drivers/scsi/isci/init.c
+++ b/drivers/scsi/isci/init.c
@@ -576,6 +576,13 @@ static struct isci_host *isci_host_alloc(struct pci_dev *pdev, int id)
 	shost->max_lun = ~0;
 	shost->max_cmd_len = MAX_COMMAND_SIZE;
 
+	/* turn on DIF support */
+	scsi_host_set_prot(shost,
+			   SHOST_DIF_TYPE1_PROTECTION |
+			   SHOST_DIF_TYPE2_PROTECTION |
+			   SHOST_DIF_TYPE3_PROTECTION);
+	scsi_host_set_guard(shost, SHOST_DIX_GUARD_CRC);
+
 	err = scsi_add_host(shost, &pdev->dev);
 	if (err)
 		goto err_shost;
@@ -663,13 +670,6 @@ static int isci_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 			goto err_host_alloc;
 		}
 		pci_info->hosts[i] = h;
-
-		/* turn on DIF support */
-		scsi_host_set_prot(to_shost(h),
-				   SHOST_DIF_TYPE1_PROTECTION |
-				   SHOST_DIF_TYPE2_PROTECTION |
-				   SHOST_DIF_TYPE3_PROTECTION);
-		scsi_host_set_guard(to_shost(h), SHOST_DIX_GUARD_CRC);
 	}
 
 	err = isci_setup_interrupts(pdev);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()
  2019-01-08 20:50 [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host() Logan Gunthorpe
@ 2019-01-08 21:25 ` Jeff Moyer
  2019-01-08 21:30 ` Jens Axboe
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Jeff Moyer @ 2019-01-08 21:25 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: linux-scsi, linux-block, linux-kernel, Intel SCU Linux support,
	Artur Paszkiewicz, James E.J. Bottomley, Martin K. Petersen,
	Christoph Hellwig, Jens Axboe, james.smart, dick.kennedy

Logan Gunthorpe <logang@deltatee.com> writes:

> scsi_mq_setup_tags(), which is called by scsi_add_host(), calculates
> the command size to allocate based on the prot_capabilities. In the
> isci driver, scsi_host_set_prot() is called after scsi_add_host()
> so the command size gets calculated to be smaller than it needs to be.
> Eventually, scsi_mq_init_request() locates the 'prot_sdb' after the command
> assuming it was sized correctly and a buffer overrun may occur.

[...]

> To prevent this, the calls to scsi_host_set_prot() are moved into
> isci_host_alloc() before the call to scsi_add_host(). Out of caution,
> also move the similar call to scsi_host_set_guard().
>
> Fixes: 3d2d75254915 ("[SCSI] isci: T10 DIF support")
> Link: http://lkml.kernel.org/r/da851333-eadd-163a-8c78-e1f4ec5ec857@deltatee.com
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> Cc: Intel SCU Linux support <intel-linux-scu@intel.com>
> Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
> Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Jeff Moyer <jmoyer@redhat.com>

Nice job, and excellent commit message.  We'll need a similar patch for
lpfc.

Reviewed-by: Jeff Moyer <jmoyer@redhat.com>

> ---
>  drivers/scsi/isci/init.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/scsi/isci/init.c b/drivers/scsi/isci/init.c
> index 68b90c4f79a3..1727d0c71b12 100644
> --- a/drivers/scsi/isci/init.c
> +++ b/drivers/scsi/isci/init.c
> @@ -576,6 +576,13 @@ static struct isci_host *isci_host_alloc(struct pci_dev *pdev, int id)
>  	shost->max_lun = ~0;
>  	shost->max_cmd_len = MAX_COMMAND_SIZE;
>  
> +	/* turn on DIF support */
> +	scsi_host_set_prot(shost,
> +			   SHOST_DIF_TYPE1_PROTECTION |
> +			   SHOST_DIF_TYPE2_PROTECTION |
> +			   SHOST_DIF_TYPE3_PROTECTION);
> +	scsi_host_set_guard(shost, SHOST_DIX_GUARD_CRC);
> +
>  	err = scsi_add_host(shost, &pdev->dev);
>  	if (err)
>  		goto err_shost;
> @@ -663,13 +670,6 @@ static int isci_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  			goto err_host_alloc;
>  		}
>  		pci_info->hosts[i] = h;
> -
> -		/* turn on DIF support */
> -		scsi_host_set_prot(to_shost(h),
> -				   SHOST_DIF_TYPE1_PROTECTION |
> -				   SHOST_DIF_TYPE2_PROTECTION |
> -				   SHOST_DIF_TYPE3_PROTECTION);
> -		scsi_host_set_guard(to_shost(h), SHOST_DIX_GUARD_CRC);
>  	}
>  
>  	err = isci_setup_interrupts(pdev);

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()
  2019-01-08 20:50 [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host() Logan Gunthorpe
  2019-01-08 21:25 ` Jeff Moyer
@ 2019-01-08 21:30 ` Jens Axboe
  2019-01-09  3:29 ` Martin K. Petersen
  2019-01-09 18:41 ` Christoph Hellwig
  3 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2019-01-08 21:30 UTC (permalink / raw)
  To: Logan Gunthorpe, linux-scsi, linux-block, linux-kernel
  Cc: Intel SCU Linux support, Artur Paszkiewicz, James E.J. Bottomley,
	Martin K. Petersen, Christoph Hellwig, Jeff Moyer

On 1/8/19 1:50 PM, Logan Gunthorpe wrote:
> scsi_mq_setup_tags(), which is called by scsi_add_host(), calculates
> the command size to allocate based on the prot_capabilities. In the
> isci driver, scsi_host_set_prot() is called after scsi_add_host()
> so the command size gets calculated to be smaller than it needs to be.
> Eventually, scsi_mq_init_request() locates the 'prot_sdb' after the command
> assuming it was sized correctly and a buffer overrun may occur.
> 
> However, seeing blk_mq_alloc_rqs() rounds up to the nearest cache line
> size, the mistake can go unnoticed.
> 
> The bug was noticed after the struct request size was reduced by
> commit 9d037ad707ed ("block: remove req->timeout_list")
> 
> Which likely reduced the allocated space for the request by an entire
> cache line, enough that the overflow could be hit and it caused a panic,
> on boot, at:
> 
>   RIP: 0010:t10_pi_complete+0x77/0x1c0
>   Call Trace:
>     <IRQ>
>     sd_done+0xf5/0x340
>     scsi_finish_command+0xc3/0x120
>     blk_done_softirq+0x83/0xb0
>     __do_softirq+0xa1/0x2e6
>     irq_exit+0xbc/0xd0
>     call_function_single_interrupt+0xf/0x20
>     </IRQ>
> 
> sd_done() would call scsi_prot_sg_count() which reads the number of
> entities in 'prot_sdb', but seeing 'prot_sdb' is located after the end of
> the allocated space it reads a garbage number and erroneously calls
> t10_pi_complete().
> 
> To prevent this, the calls to scsi_host_set_prot() are moved into
> isci_host_alloc() before the call to scsi_add_host(). Out of caution,
> also move the similar call to scsi_host_set_guard().

Nice work!

Reviewed-by: Jens Axboe <axboe@kernel.dk>

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()
  2019-01-08 20:50 [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host() Logan Gunthorpe
  2019-01-08 21:25 ` Jeff Moyer
  2019-01-08 21:30 ` Jens Axboe
@ 2019-01-09  3:29 ` Martin K. Petersen
  2019-01-09 18:41 ` Christoph Hellwig
  3 siblings, 0 replies; 10+ messages in thread
From: Martin K. Petersen @ 2019-01-09  3:29 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: linux-scsi, linux-block, linux-kernel, Intel SCU Linux support,
	Artur Paszkiewicz, James E.J. Bottomley, Martin K. Petersen,
	Christoph Hellwig, Jens Axboe, Jeff Moyer


Logan,

> To prevent this, the calls to scsi_host_set_prot() are moved into
> isci_host_alloc() before the call to scsi_add_host(). Out of caution,
> also move the similar call to scsi_host_set_guard().

Applied to 5.0/scsi-fixes. Thanks much!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()
  2019-01-08 20:50 [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host() Logan Gunthorpe
                   ` (2 preceding siblings ...)
  2019-01-09  3:29 ` Martin K. Petersen
@ 2019-01-09 18:41 ` Christoph Hellwig
  2019-01-10  9:11   ` John Garry
  3 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2019-01-09 18:41 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: linux-scsi, linux-block, linux-kernel, Intel SCU Linux support,
	Artur Paszkiewicz, James E.J. Bottomley, Martin K. Petersen,
	Christoph Hellwig, Jens Axboe, Jeff Moyer

This looks good.  I wonder if there is any good way to prevent other
drivers from picking up this bug byt using a better interface, but
that should not delay your fix.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()
  2019-01-09 18:41 ` Christoph Hellwig
@ 2019-01-10  9:11   ` John Garry
  2019-01-12  2:34     ` Martin K. Petersen
  0 siblings, 1 reply; 10+ messages in thread
From: John Garry @ 2019-01-10  9:11 UTC (permalink / raw)
  To: Christoph Hellwig, Logan Gunthorpe
  Cc: linux-scsi, linux-block, linux-kernel, Intel SCU Linux support,
	Artur Paszkiewicz, James E.J. Bottomley, Martin K. Petersen,
	Jens Axboe, Jeff Moyer, chenxiang

On 09/01/2019 18:41, Christoph Hellwig wrote:
> This looks good.  I wonder if there is any good way to prevent other
> drivers from picking up this bug byt using a better interface, but
> that should not delay your fix.
>
> .
>

I noticed that hisi_sas has this same problem but I forgot to fix it.

So how about just drop these APIs and let the user set the shost 
protection parameters directly, like other shost parameters, which 
should make it a bit clearer when these should be set, i.e. before 
scsi_add_host()?

John



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()
  2019-01-10  9:11   ` John Garry
@ 2019-01-12  2:34     ` Martin K. Petersen
  2019-01-14 12:10       ` John Garry
  0 siblings, 1 reply; 10+ messages in thread
From: Martin K. Petersen @ 2019-01-12  2:34 UTC (permalink / raw)
  To: John Garry
  Cc: Christoph Hellwig, Logan Gunthorpe, linux-scsi, linux-block,
	linux-kernel, Intel SCU Linux support, Artur Paszkiewicz,
	James E.J. Bottomley, Martin K. Petersen, Jens Axboe, Jeff Moyer,
	chenxiang


John,

> So how about just drop these APIs and let the user set the shost
> protection parameters directly, like other shost parameters,

The protection interfaces here obviously predate the block layer
allocation changes that made this particular issue pop up.

> which should make it a bit clearer when these should be set,
> i.e. before scsi_add_host()?

In general, I am not so keen on the somewhat messy intersection between
the host parameters and the host template. The static host templates
made lots of sense in the days of Seagate ST01 and fixed hardware
capabilities.  But reality is that most modern controllers have to query
firmware interfaces to figure out what their actual capabilities are.

So in this case I think that accessor functions are actually better
because they allow us to print a big fat warning when you twiddle
something you shouldn't post-initialization. So that's something I think
we could--and should--improve.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()
  2019-01-12  2:34     ` Martin K. Petersen
@ 2019-01-14 12:10       ` John Garry
  2019-01-16  2:54         ` Martin K. Petersen
  0 siblings, 1 reply; 10+ messages in thread
From: John Garry @ 2019-01-14 12:10 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Logan Gunthorpe, linux-scsi, linux-block,
	linux-kernel, Intel SCU Linux support, Artur Paszkiewicz,
	James E.J. Bottomley, Jens Axboe, Jeff Moyer, chenxiang

On 12/01/2019 02:34, Martin K. Petersen wrote:
>
> John,
>
>> So how about just drop these APIs and let the user set the shost
>> protection parameters directly, like other shost parameters,
>
> The protection interfaces here obviously predate the block layer
> allocation changes that made this particular issue pop up.
>
>> which should make it a bit clearer when these should be set,
>> i.e. before scsi_add_host()?
>
> In general, I am not so keen on the somewhat messy intersection between
> the host parameters and the host template. The static host templates
> made lots of sense in the days of Seagate ST01 and fixed hardware
> capabilities.  But reality is that most modern controllers have to query
> firmware interfaces to figure out what their actual capabilities are.

Hi Martin,

I am not suggested setting the parameters via scsi host template, but 
rather dynamically (as we currently do) but just drop the set helper 
functions, like:

     shost->max_channel = 1;
     shost->max_cmd_len = 16;

     ...

     if (hisi_hba->prot_mask) {
         dev_info(dev, "Registering for DIF/DIX prot_mask=0x%x\n",
              prot_mask);
-        scsi_host_set_prot(hisi_hba->shost, prot_mask);
+        shost->prot_capabilities = prot_mask;
     }

     rc = scsi_add_host(shost, dev);
     if (rc)
         goto err_out_ha;

     rc = sas_register_ha(sha);
     if (rc)
         goto err_out_register_ha;

I find that it is not crystal clear when scsi_host_set_prot() and 
scsi_host_set_guard() should be called, but not so for setting the shost 
parameters directly, which is common.

>
> So in this case I think that accessor functions are actually better
> because they allow us to print a big fat warning when you twiddle
> something you shouldn't post-initialization. So that's something I think
> we could--and should--improve.
>

Sure, this is an alternative, but I would rather make it obvious when 
these parameters should be set so that this would not be required.

Thanks,
John



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()
  2019-01-14 12:10       ` John Garry
@ 2019-01-16  2:54         ` Martin K. Petersen
  2019-01-16 14:44           ` John Garry
  0 siblings, 1 reply; 10+ messages in thread
From: Martin K. Petersen @ 2019-01-16  2:54 UTC (permalink / raw)
  To: John Garry
  Cc: Martin K. Petersen, Christoph Hellwig, Logan Gunthorpe,
	linux-scsi, linux-block, linux-kernel, Intel SCU Linux support,
	Artur Paszkiewicz, James E.J. Bottomley, Jens Axboe, Jeff Moyer,
	chenxiang


Hi John,

>> So in this case I think that accessor functions are actually better
>> because they allow us to print a big fat warning when you twiddle
>> something you shouldn't post-initialization. So that's something I think
>> we could--and should--improve.
>>
> Sure, this is an alternative, but I would rather make it obvious when
> these parameters should be set so that this would not be required.

I would like to have a mechanism in place that warns if you twiddle
things that break assumptions made at host registration time. That's not
a scenario the old registration interface was designed to handle.

I am not sure I agree with your assertion that setting these masks in
the struct prior to scsi_add_host() makes this ordering requirement much
more obvious. It is not like you are passing in a list of parameters and
then receiving a separately instantiated immutable scsi_host object. You
are performing an operation on something you already have and own.

That's why I commented that the current intersection between
compile-time static host template, dynamically discovered
pre-registration scsi_host parameters, and the registered runtime
scsi_host struct is somewhat messy.

Btw. I have no attachment to the prot wrappers whatsoever. The reason
they exist is that the SCSI integrity support was optional. And
therefore we had stub functions so things could be compiled without any
of the integrity fields being present in the various SCSI structs. So I
don't have any problem killing the wrappers except they may actually be
handy for regressions like this one where you could #error if the driver
writer violates the ordering requirement.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()
  2019-01-16  2:54         ` Martin K. Petersen
@ 2019-01-16 14:44           ` John Garry
  0 siblings, 0 replies; 10+ messages in thread
From: John Garry @ 2019-01-16 14:44 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Logan Gunthorpe, linux-scsi, linux-block,
	linux-kernel, Intel SCU Linux support, Artur Paszkiewicz,
	James E.J. Bottomley, Jens Axboe, Jeff Moyer, chenxiang

On 16/01/2019 02:54, Martin K. Petersen wrote:
>
> Hi John,
>

Hi Martin,

>>> So in this case I think that accessor functions are actually better
>>> because they allow us to print a big fat warning when you twiddle
>>> something you shouldn't post-initialization. So that's something I think
>>> we could--and should--improve.
>>>
>> Sure, this is an alternative, but I would rather make it obvious when
>> these parameters should be set so that this would not be required.
>
> I would like to have a mechanism in place that warns if you twiddle
> things that break assumptions made at host registration time.

Yes, something more robust would be good.

That's not
> a scenario the old registration interface was designed to handle.
>
> I am not sure I agree with your assertion that setting these masks in
> the struct prior to scsi_add_host() makes this ordering requirement much
> more obvious.
It is not like you are passing in a list of parameters and
> then receiving a separately instantiated immutable scsi_host object. You
> are performing an operation on something you already have and own.
>
> That's why I commented that the current intersection between
> compile-time static host template, dynamically discovered
> pre-registration scsi_host parameters, and the registered runtime
> scsi_host struct is somewhat messy.
>
> Btw. I have no attachment to the prot wrappers whatsoever. The reason
> they exist is that the SCSI integrity support was optional. And
> therefore we had stub functions so things could be compiled without any
> of the integrity fields being present in the various SCSI structs.

I never noticed stubs for setting/getting 
Scsi_host.prot_{capabilities,guard_type}

So I
> don't have any problem killing the wrappers except they may actually be
> handy for regressions like this one where you could #error if the driver
> writer violates the ordering requirement.
>

We set many Scsi_host parameters without such safeguarding, and I don't 
know what's special about these protection-related members.

Thanks,
John



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-01-16 14:45 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-08 20:50 [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host() Logan Gunthorpe
2019-01-08 21:25 ` Jeff Moyer
2019-01-08 21:30 ` Jens Axboe
2019-01-09  3:29 ` Martin K. Petersen
2019-01-09 18:41 ` Christoph Hellwig
2019-01-10  9:11   ` John Garry
2019-01-12  2:34     ` Martin K. Petersen
2019-01-14 12:10       ` John Garry
2019-01-16  2:54         ` Martin K. Petersen
2019-01-16 14:44           ` John Garry

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).