linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4] qla2xxx: Fix unbound NVME response length
@ 2020-01-21 19:27 Himanshu Madhani
  2020-01-22 21:23 ` Ewan D. Milne
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Himanshu Madhani @ 2020-01-21 19:27 UTC (permalink / raw)
  To: James.Bottomley, martin.petersen; +Cc: hmadhani, linux-scsi

From: Arun Easi <aeasi@marvell.com>

On certain cases when response length is less than 32, NVME response data
is supplied inline in IOCB. This is indicated by some combination of state
flags. There was an instance when a high, and incorrect, response length was
indicated causing driver to overrun buffers. Fix this by checking and
limiting the response payload length.

Fixes: 7401bc18d1ee3 ("scsi: qla2xxx: Add FC-NVMe command handling")
Cc: stable@vger.kernel.org
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
---
Hi Martin,

We discovered issue with our newer Gen7 adapter when response length
happens to be larger than 32 bytes, could result into crash.

Please apply this to 5.5/scsi-fixes branch at your earliest convenience.

Changes from v3 -> v4

o use "sizeof(struct nvme_fc_ersp_iu)" in missed place.

Changes from v2 -> v3

o Use "sizeof(struct nvme_fc_ersp_iu)" to indicate response payload size.

Changes from v1 -> v2

o Fixed the tag for stable.
o Removed logit which got spilled from other patch to prevent compile failure.

Thanks,
Himanshu
---
 drivers/scsi/qla2xxx/qla_isr.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index e7bad0bfffda..4caec94d8e99 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -1939,6 +1939,16 @@ static void qla24xx_nvme_iocb_entry(scsi_qla_host_t *vha, struct req_que *req,
 		inbuf = (uint32_t *)&sts->nvme_ersp_data;
 		outbuf = (uint32_t *)fd->rspaddr;
 		iocb->u.nvme.rsp_pyld_len = le16_to_cpu(sts->nvme_rsp_pyld_len);
+		if (unlikely(iocb->u.nvme.rsp_pyld_len >
+		    sizeof(struct nvme_fc_ersp_iu))) {
+			WARN_ONCE(1, "Unexpected response payload length %u.\n",
+			    iocb->u.nvme.rsp_pyld_len);
+			ql_log(ql_log_warn, fcport->vha, 0x5100,
+			    "Unexpected response payload length %u.\n",
+			    iocb->u.nvme.rsp_pyld_len);
+			iocb->u.nvme.rsp_pyld_len =
+			    sizeof(struct nvme_fc_ersp_iu);
+		}
 		iter = iocb->u.nvme.rsp_pyld_len >> 2;
 		for (; iter; iter--)
 			*outbuf++ = swab32(*inbuf++);
-- 
2.12.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v4] qla2xxx: Fix unbound NVME response length
  2020-01-21 19:27 [PATCH v4] qla2xxx: Fix unbound NVME response length Himanshu Madhani
@ 2020-01-22 21:23 ` Ewan D. Milne
  2020-01-22 23:18 ` Martin K. Petersen
  2020-01-22 23:59 ` Elliott, Robert (Servers)
  2 siblings, 0 replies; 8+ messages in thread
From: Ewan D. Milne @ 2020-01-22 21:23 UTC (permalink / raw)
  To: Himanshu Madhani, James.Bottomley, martin.petersen; +Cc: linux-scsi

On Tue, 2020-01-21 at 11:27 -0800, Himanshu Madhani wrote:
> From: Arun Easi <aeasi@marvell.com>
> 
> On certain cases when response length is less than 32, NVME response data
> is supplied inline in IOCB. This is indicated by some combination of state
> flags. There was an instance when a high, and incorrect, response length was
> indicated causing driver to overrun buffers. Fix this by checking and
> limiting the response payload length.
> 
> Fixes: 7401bc18d1ee3 ("scsi: qla2xxx: Add FC-NVMe command handling")
> Cc: stable@vger.kernel.org
> Signed-off-by: Arun Easi <aeasi@marvell.com>
> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
> ---
> Hi Martin,
> 
> We discovered issue with our newer Gen7 adapter when response length
> happens to be larger than 32 bytes, could result into crash.
> 
> Please apply this to 5.5/scsi-fixes branch at your earliest convenience.
> 
> Changes from v3 -> v4
> 
> o use "sizeof(struct nvme_fc_ersp_iu)" in missed place.
> 
> Changes from v2 -> v3
> 
> o Use "sizeof(struct nvme_fc_ersp_iu)" to indicate response payload size.
> 
> Changes from v1 -> v2
> 
> o Fixed the tag for stable.
> o Removed logit which got spilled from other patch to prevent compile failure.
> 
> Thanks,
> Himanshu
> ---
>  drivers/scsi/qla2xxx/qla_isr.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
> index e7bad0bfffda..4caec94d8e99 100644
> --- a/drivers/scsi/qla2xxx/qla_isr.c
> +++ b/drivers/scsi/qla2xxx/qla_isr.c
> @@ -1939,6 +1939,16 @@ static void qla24xx_nvme_iocb_entry(scsi_qla_host_t *vha, struct req_que *req,
>  		inbuf = (uint32_t *)&sts->nvme_ersp_data;
>  		outbuf = (uint32_t *)fd->rspaddr;
>  		iocb->u.nvme.rsp_pyld_len = le16_to_cpu(sts->nvme_rsp_pyld_len);
> +		if (unlikely(iocb->u.nvme.rsp_pyld_len >
> +		    sizeof(struct nvme_fc_ersp_iu))) {
> +			WARN_ONCE(1, "Unexpected response payload length %u.\n",
> +			    iocb->u.nvme.rsp_pyld_len);
> +			ql_log(ql_log_warn, fcport->vha, 0x5100,
> +			    "Unexpected response payload length %u.\n",
> +			    iocb->u.nvme.rsp_pyld_len);
> +			iocb->u.nvme.rsp_pyld_len =
> +			    sizeof(struct nvme_fc_ersp_iu);
> +		}
>  		iter = iocb->u.nvme.rsp_pyld_len >> 2;
>  		for (; iter; iter--)
>  			*outbuf++ = swab32(*inbuf++);

Thanks.

Reviewed-by: Ewan D. Milne <emilne@redhat.com>



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4] qla2xxx: Fix unbound NVME response length
  2020-01-21 19:27 [PATCH v4] qla2xxx: Fix unbound NVME response length Himanshu Madhani
  2020-01-22 21:23 ` Ewan D. Milne
@ 2020-01-22 23:18 ` Martin K. Petersen
  2020-01-22 23:59 ` Elliott, Robert (Servers)
  2 siblings, 0 replies; 8+ messages in thread
From: Martin K. Petersen @ 2020-01-22 23:18 UTC (permalink / raw)
  To: Himanshu Madhani; +Cc: James.Bottomley, martin.petersen, linux-scsi


Himanshu,

> On certain cases when response length is less than 32, NVME response
> data is supplied inline in IOCB. This is indicated by some combination
> of state flags. There was an instance when a high, and incorrect,
> response length was indicated causing driver to overrun buffers. Fix
> this by checking and limiting the response payload length.

Applied to 5.5/scsi-fixes, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH v4] qla2xxx: Fix unbound NVME response length
  2020-01-21 19:27 [PATCH v4] qla2xxx: Fix unbound NVME response length Himanshu Madhani
  2020-01-22 21:23 ` Ewan D. Milne
  2020-01-22 23:18 ` Martin K. Petersen
@ 2020-01-22 23:59 ` Elliott, Robert (Servers)
  2020-01-23  0:20   ` Arun Easi
  2 siblings, 1 reply; 8+ messages in thread
From: Elliott, Robert (Servers) @ 2020-01-22 23:59 UTC (permalink / raw)
  To: Himanshu Madhani, James.Bottomley, martin.petersen; +Cc: linux-scsi



> -----Original Message-----
> From: linux-scsi-owner@vger.kernel.org <linux-scsi-owner@vger.kernel.org>
> On Behalf Of Himanshu Madhani
> Sent: Tuesday, January 21, 2020 1:27 PM
> Subject: [PATCH v4] qla2xxx: Fix unbound NVME response length
...
> We discovered issue with our newer Gen7 adapter when response length
> happens to be larger than 32 bytes, could result into crash.
...
>  drivers/scsi/qla2xxx/qla_isr.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_isr.c
...
> +		if (unlikely(iocb->u.nvme.rsp_pyld_len >
> +		    sizeof(struct nvme_fc_ersp_iu))) {
> +			WARN_ONCE(1, "Unexpected response payload length %u.\n",
> +			    iocb->u.nvme.rsp_pyld_len);

Do you really need a kernel stack dump for this error, which the WARN
macros create? The problem would be caused by firmware behavior, not
something wrong in the kernel.

If this function runs in interrupt context (based on the filename),
then printing lots of data to the slow serial port can cause soft
lockups and other issues.

> +			ql_log(ql_log_warn, fcport->vha, 0x5100,
> +			    "Unexpected response payload length %u.\n",
> +			    iocb->u.nvme.rsp_pyld_len);
> +			iocb->u.nvme.rsp_pyld_len =
> +			    sizeof(struct nvme_fc_ersp_iu);
> +		}

If the problem is due to some firmware incompatibility and every
response is long, the kernel log will quickly become full of
these messages - per-IO prints are noisy. The handling implies
the driver thinks it's safe to proceed, so there's nothing that
is going to keep the problem from reoccurring. If the handling was
to report a failed IO and shut down the device, then the number
of possible error messages would quickly cease.

Safer approaches would be to print only once and maintain a count
of errors in sysfs, or use ratelimited print functions.



^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH v4] qla2xxx: Fix unbound NVME response length
  2020-01-22 23:59 ` Elliott, Robert (Servers)
@ 2020-01-23  0:20   ` Arun Easi
  2020-01-23  2:06     ` Martin K. Petersen
  2020-01-23 23:21     ` Ewan D. Milne
  0 siblings, 2 replies; 8+ messages in thread
From: Arun Easi @ 2020-01-23  0:20 UTC (permalink / raw)
  To: Elliott, Robert (Servers)
  Cc: Himanshu Madhani, James.Bottomley, martin.petersen, linux-scsi

Thanks for the review, Robert. Response inline..

On Wed, 22 Jan 2020, 3:59pm, Elliott, Robert (Servers) wrote:

> 
> 
> > -----Original Message-----
> > From: linux-scsi-owner@vger.kernel.org <linux-scsi-owner@vger.kernel.org>
> > On Behalf Of Himanshu Madhani
> > Sent: Tuesday, January 21, 2020 1:27 PM
> > Subject: [PATCH v4] qla2xxx: Fix unbound NVME response length
> ...
> > We discovered issue with our newer Gen7 adapter when response length
> > happens to be larger than 32 bytes, could result into crash.
> ...
> >  drivers/scsi/qla2xxx/qla_isr.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/drivers/scsi/qla2xxx/qla_isr.c
> ...
> > +		if (unlikely(iocb->u.nvme.rsp_pyld_len >
> > +		    sizeof(struct nvme_fc_ersp_iu))) {
> > +			WARN_ONCE(1, "Unexpected response payload length %u.\n",
> > +			    iocb->u.nvme.rsp_pyld_len);
> 
> Do you really need a kernel stack dump for this error, which the WARN
> macros create? The problem would be caused by firmware behavior, not
> something wrong in the kernel.

The intent was to bring this to the tester's notice. My expectation is 
that this would be removed once the root cause is known. The issue was not 
reproducible internally.

> 
> If this function runs in interrupt context (based on the filename),
> then printing lots of data to the slow serial port can cause soft
> lockups and other issues.

In retrospect, this should have been under the driver debug tunable (which 
is set usually by testers).

> 
> > +			ql_log(ql_log_warn, fcport->vha, 0x5100,
> > +			    "Unexpected response payload length %u.\n",
> > +			    iocb->u.nvme.rsp_pyld_len);
> > +			iocb->u.nvme.rsp_pyld_len =
> > +			    sizeof(struct nvme_fc_ersp_iu);
> > +		}
> 
> If the problem is due to some firmware incompatibility and every
> response is long, the kernel log will quickly become full of
> these messages - per-IO prints are noisy. The handling implies
> the driver thinks it's safe to proceed, so there's nothing that
> is going to keep the problem from reoccurring. If the handling was
> to report a failed IO and shut down the device, then the number
> of possible error messages would quickly cease.
> 
> Safer approaches would be to print only once and maintain a count
> of errors in sysfs, or use ratelimited print functions.

I can post a follow on patch, with the WARN/log message under driver 
debug.

Regards,
-Arun

> 
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4] qla2xxx: Fix unbound NVME response length
  2020-01-23  0:20   ` Arun Easi
@ 2020-01-23  2:06     ` Martin K. Petersen
  2020-01-23 22:51       ` [EXT] " Arun Easi
  2020-01-23 23:21     ` Ewan D. Milne
  1 sibling, 1 reply; 8+ messages in thread
From: Martin K. Petersen @ 2020-01-23  2:06 UTC (permalink / raw)
  To: Arun Easi
  Cc: Elliott, Robert (Servers),
	Himanshu Madhani, James.Bottomley, martin.petersen, linux-scsi


Arun,

> I can post a follow on patch, with the WARN/log message under driver 
> debug.

Just send a v5. Thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [EXT] Re: [PATCH v4] qla2xxx: Fix unbound NVME response length
  2020-01-23  2:06     ` Martin K. Petersen
@ 2020-01-23 22:51       ` Arun Easi
  0 siblings, 0 replies; 8+ messages in thread
From: Arun Easi @ 2020-01-23 22:51 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Elliott, Robert (Servers), Himanshu Madhani, James.Bottomley, linux-scsi

On Wed, 22 Jan 2020, 6:06pm, Martin K. Petersen wrote:

> Arun,
> 
> > I can post a follow on patch, with the WARN/log message under driver 
> > debug.
> 
> Just send a v5. Thanks!
> 

Will do. Thanks Martin.

Regards,
-Arun

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4] qla2xxx: Fix unbound NVME response length
  2020-01-23  0:20   ` Arun Easi
  2020-01-23  2:06     ` Martin K. Petersen
@ 2020-01-23 23:21     ` Ewan D. Milne
  1 sibling, 0 replies; 8+ messages in thread
From: Ewan D. Milne @ 2020-01-23 23:21 UTC (permalink / raw)
  To: Arun Easi, Elliott, Robert (Servers)
  Cc: Himanshu Madhani, James.Bottomley, martin.petersen, linux-scsi

On Wed, 2020-01-22 at 16:20 -0800, Arun Easi wrote:
> Thanks for the review, Robert. Response inline..
> 
> On Wed, 22 Jan 2020, 3:59pm, Elliott, Robert (Servers) wrote:
> 
> > 
> > > -----Original Message-----
> > > From: linux-scsi-owner@vger.kernel.org <linux-scsi-owner@vger.kernel.org>
> > > On Behalf Of Himanshu Madhani
> > > Sent: Tuesday, January 21, 2020 1:27 PM
> > > Subject: [PATCH v4] qla2xxx: Fix unbound NVME response length
> > ...
> > > We discovered issue with our newer Gen7 adapter when response length
> > > happens to be larger than 32 bytes, could result into crash.
> > ...
> > >  drivers/scsi/qla2xxx/qla_isr.c | 10 ++++++++++
> > >  1 file changed, 10 insertions(+)
> > > 
> > > diff --git a/drivers/scsi/qla2xxx/qla_isr.c
> > ...
> > > +		if (unlikely(iocb->u.nvme.rsp_pyld_len >
> > > +		    sizeof(struct nvme_fc_ersp_iu))) {
> > > +			WARN_ONCE(1, "Unexpected response payload length %u.\n",
> > > +			    iocb->u.nvme.rsp_pyld_len);
> > 
> > Do you really need a kernel stack dump for this error, which the WARN
> > macros create? The problem would be caused by firmware behavior, not
> > something wrong in the kernel.
> 
> The intent was to bring this to the tester's notice. My expectation is 
> that this would be removed once the root cause is known. The issue was not 
> reproducible internally.

We have a reproducible test case, so testing the patch was
not a problem.  I agree the log message should be restricted
or suppressed though, once would be enough.

The problem appeared to be that an extra bit was set (because
the length was too long by 16K) and our testing worked OK with
the ersp_iu data truncated to the correct structure size.

-Ewan

> 
> > If this function runs in interrupt context (based on the filename),
> > then printing lots of data to the slow serial port can cause soft
> > lockups and other issues.
> 
> In retrospect, this should have been under the driver debug tunable (which 
> is set usually by testers).
> 
> > > +			ql_log(ql_log_warn, fcport->vha, 0x5100,
> > > +			    "Unexpected response payload length %u.\n",
> > > +			    iocb->u.nvme.rsp_pyld_len);
> > > +			iocb->u.nvme.rsp_pyld_len =
> > > +			    sizeof(struct nvme_fc_ersp_iu);
> > > +		}
> > 
> > If the problem is due to some firmware incompatibility and every
> > response is long, the kernel log will quickly become full of
> > these messages - per-IO prints are noisy. The handling implies
> > the driver thinks it's safe to proceed, so there's nothing that
> > is going to keep the problem from reoccurring. If the handling was
> > to report a failed IO and shut down the device, then the number
> > of possible error messages would quickly cease.
> > 
> > Safer approaches would be to print only once and maintain a count
> > of errors in sysfs, or use ratelimited print functions.
> 
> I can post a follow on patch, with the WARN/log message under driver 
> debug.
> 
> Regards,
> -Arun
> 
> > 
> > 


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-01-23 23:21 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-21 19:27 [PATCH v4] qla2xxx: Fix unbound NVME response length Himanshu Madhani
2020-01-22 21:23 ` Ewan D. Milne
2020-01-22 23:18 ` Martin K. Petersen
2020-01-22 23:59 ` Elliott, Robert (Servers)
2020-01-23  0:20   ` Arun Easi
2020-01-23  2:06     ` Martin K. Petersen
2020-01-23 22:51       ` [EXT] " Arun Easi
2020-01-23 23:21     ` Ewan D. Milne

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).