All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] qla2xxx: Two bug fixes
@ 2017-01-23 16:34 Bart Van Assche
  2017-01-23 16:34   ` Bart Van Assche
  2017-01-23 16:34   ` Bart Van Assche
  0 siblings, 2 replies; 25+ messages in thread
From: Bart Van Assche @ 2017-01-23 16:34 UTC (permalink / raw)
  To: Martin K . Petersen; +Cc: linux-scsi, Bart Van Assche

Hello Martin,

While testing kernel v4.10-rc5 I ran into two bugs in the qla2xxx driver.
The two patches in this series fixes these bugs. Please consider these
patches for inclusion in the upstream kernel.

Bart Van Assche (2):
  qla2xxx: Fix a recently introduced memory leak
  qla2xxx: Avoid that issuing a LIP triggers a kernel crash

 drivers/scsi/qla2xxx/qla_isr.c | 3 ++-
 drivers/scsi/qla2xxx/qla_os.c  | 6 +++++-
 2 files changed, 7 insertions(+), 2 deletions(-)

-- 
2.11.0


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/2] qla2xxx: Fix a recently introduced memory leak
  2017-01-23 16:34 [PATCH 0/2] qla2xxx: Two bug fixes Bart Van Assche
@ 2017-01-23 16:34   ` Bart Van Assche
  2017-01-23 16:34   ` Bart Van Assche
  1 sibling, 0 replies; 25+ messages in thread
From: Bart Van Assche @ 2017-01-23 16:34 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: linux-scsi, Bart Van Assche, Michael Hernandez, Himanshu Madhani,
	Christoph Hellwig, stable

qla2x00_probe_one() allocates IRQs before it initializes rsp_q_map
so IRQs must be freed even if rsp_q_map allocation did not occur.
This was detected by kmemleak.

Fixes: 4fa183455988 ("scsi: qla2xxx: Utilize pci_alloc_irq_vectors/pci_free_irq_vectors calls")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Michael Hernandez <michael.hernandez@cavium.com>
Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
---
 drivers/scsi/qla2xxx/qla_isr.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index dc88a09f9043..a94b0b6bd030 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -3242,7 +3242,7 @@ qla2x00_free_irqs(scsi_qla_host_t *vha)
 	 * from a probe failure context.
 	 */
 	if (!ha->rsp_q_map || !ha->rsp_q_map[0])
-		return;
+		goto free_irqs;
 	rsp = ha->rsp_q_map[0];
 
 	if (ha->flags.msix_enabled) {
@@ -3262,6 +3262,7 @@ qla2x00_free_irqs(scsi_qla_host_t *vha)
 		free_irq(pci_irq_vector(ha->pdev, 0), rsp);
 	}
 
+free_irqs:
 	pci_free_irq_vectors(ha->pdev);
 }
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 1/2] qla2xxx: Fix a recently introduced memory leak
@ 2017-01-23 16:34   ` Bart Van Assche
  0 siblings, 0 replies; 25+ messages in thread
From: Bart Van Assche @ 2017-01-23 16:34 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: linux-scsi, Bart Van Assche, Michael Hernandez, Himanshu Madhani,
	Christoph Hellwig, stable

qla2x00_probe_one() allocates IRQs before it initializes rsp_q_map
so IRQs must be freed even if rsp_q_map allocation did not occur.
This was detected by kmemleak.

Fixes: 4fa183455988 ("scsi: qla2xxx: Utilize pci_alloc_irq_vectors/pci_free_irq_vectors calls")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Michael Hernandez <michael.hernandez@cavium.com>
Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
---
 drivers/scsi/qla2xxx/qla_isr.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index dc88a09f9043..a94b0b6bd030 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -3242,7 +3242,7 @@ qla2x00_free_irqs(scsi_qla_host_t *vha)
 	 * from a probe failure context.
 	 */
 	if (!ha->rsp_q_map || !ha->rsp_q_map[0])
-		return;
+		goto free_irqs;
 	rsp = ha->rsp_q_map[0];
 
 	if (ha->flags.msix_enabled) {
@@ -3262,6 +3262,7 @@ qla2x00_free_irqs(scsi_qla_host_t *vha)
 		free_irq(pci_irq_vector(ha->pdev, 0), rsp);
 	}
 
+free_irqs:
 	pci_free_irq_vectors(ha->pdev);
 }
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/2] qla2xxx: Avoid that issuing a LIP triggers a kernel crash
  2017-01-23 16:34 [PATCH 0/2] qla2xxx: Two bug fixes Bart Van Assche
@ 2017-01-23 16:34   ` Bart Van Assche
  2017-01-23 16:34   ` Bart Van Assche
  1 sibling, 0 replies; 25+ messages in thread
From: Bart Van Assche @ 2017-01-23 16:34 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: linux-scsi, Bart Van Assche, Naresh Bannoth,
	Mauricio Faria de Oliveira, Himanshu Madhani, stable

Avoid that issuing a LIP as follows:

  find /sys -name 'issue_lip'|while read f; do echo 1 > $f; done

triggers the following:

BUG: unable to handle kernel NULL pointer dereference at (null)
Call Trace:
 qla2x00_abort_all_cmds+0xed/0x140 [qla2xxx]
 qla2x00_abort_isp_cleanup+0x1e3/0x280 [qla2xxx]
 qla2x00_abort_isp+0xef/0x690 [qla2xxx]
 qla2x00_do_dpc+0x36c/0x880 [qla2xxx]
 kthread+0x10c/0x140

Fixes: 1535aa75a3d8 ("qla2xxx: fix invalid DMA access after command aborts in PCI device remove")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Naresh Bannoth <nbannoth@in.ibm.com>
Cc: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Cc: <stable@vger.kernel.org>
---
 drivers/scsi/qla2xxx/qla_os.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 0a000ecf0881..ae9c5a7b239a 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1600,6 +1600,7 @@ qla2x00_abort_all_cmds(scsi_qla_host_t *vha, int res)
 	srb_t *sp;
 	struct qla_hw_data *ha = vha->hw;
 	struct req_que *req;
+	struct scsi_cmnd *scmd;
 
 	qlt_host_reset_handler(ha);
 
@@ -1613,6 +1614,8 @@ qla2x00_abort_all_cmds(scsi_qla_host_t *vha, int res)
 		for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) {
 			sp = req->outstanding_cmds[cnt];
 			if (sp) {
+				scmd = GET_CMD_SP(sp);
+
 				/* Don't abort commands in adapter during EEH
 				 * recovery as it's not accessible/responding.
 				 */
@@ -1624,7 +1627,8 @@ qla2x00_abort_all_cmds(scsi_qla_host_t *vha, int res)
 					 */
 					sp_get(sp);
 					spin_unlock_irqrestore(&ha->hardware_lock, flags);
-					qla2xxx_eh_abort(GET_CMD_SP(sp));
+					if (scmd)
+						qla2xxx_eh_abort(scmd);
 					spin_lock_irqsave(&ha->hardware_lock, flags);
 				}
 				req->outstanding_cmds[cnt] = NULL;
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/2] qla2xxx: Avoid that issuing a LIP triggers a kernel crash
@ 2017-01-23 16:34   ` Bart Van Assche
  0 siblings, 0 replies; 25+ messages in thread
From: Bart Van Assche @ 2017-01-23 16:34 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: linux-scsi, Bart Van Assche, Naresh Bannoth,
	Mauricio Faria de Oliveira, Himanshu Madhani, stable

Avoid that issuing a LIP as follows:

  find /sys -name 'issue_lip'|while read f; do echo 1 > $f; done

triggers the following:

BUG: unable to handle kernel NULL pointer dereference at (null)
Call Trace:
 qla2x00_abort_all_cmds+0xed/0x140 [qla2xxx]
 qla2x00_abort_isp_cleanup+0x1e3/0x280 [qla2xxx]
 qla2x00_abort_isp+0xef/0x690 [qla2xxx]
 qla2x00_do_dpc+0x36c/0x880 [qla2xxx]
 kthread+0x10c/0x140

Fixes: 1535aa75a3d8 ("qla2xxx: fix invalid DMA access after command aborts in PCI device remove")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Naresh Bannoth <nbannoth@in.ibm.com>
Cc: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Cc: <stable@vger.kernel.org>
---
 drivers/scsi/qla2xxx/qla_os.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 0a000ecf0881..ae9c5a7b239a 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1600,6 +1600,7 @@ qla2x00_abort_all_cmds(scsi_qla_host_t *vha, int res)
 	srb_t *sp;
 	struct qla_hw_data *ha = vha->hw;
 	struct req_que *req;
+	struct scsi_cmnd *scmd;
 
 	qlt_host_reset_handler(ha);
 
@@ -1613,6 +1614,8 @@ qla2x00_abort_all_cmds(scsi_qla_host_t *vha, int res)
 		for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) {
 			sp = req->outstanding_cmds[cnt];
 			if (sp) {
+				scmd = GET_CMD_SP(sp);
+
 				/* Don't abort commands in adapter during EEH
 				 * recovery as it's not accessible/responding.
 				 */
@@ -1624,7 +1627,8 @@ qla2x00_abort_all_cmds(scsi_qla_host_t *vha, int res)
 					 */
 					sp_get(sp);
 					spin_unlock_irqrestore(&ha->hardware_lock, flags);
-					qla2xxx_eh_abort(GET_CMD_SP(sp));
+					if (scmd)
+						qla2xxx_eh_abort(scmd);
 					spin_lock_irqsave(&ha->hardware_lock, flags);
 				}
 				req->outstanding_cmds[cnt] = NULL;
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] qla2xxx: Fix a recently introduced memory leak
  2017-01-23 16:34   ` Bart Van Assche
  (?)
@ 2017-01-23 16:45   ` Christoph Hellwig
  -1 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2017-01-23 16:45 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, linux-scsi, Michael Hernandez,
	Himanshu Madhani, Christoph Hellwig, stable

Thanks Bart,

this looks good to me.

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] qla2xxx: Fix a recently introduced memory leak
  2017-01-23 16:34   ` Bart Van Assche
  (?)
  (?)
@ 2017-01-23 17:04   ` Madhani, Himanshu
  -1 siblings, 0 replies; 25+ messages in thread
From: Madhani, Himanshu @ 2017-01-23 17:04 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: linux-scsi, Hernandez, Michael, Christoph Hellwig, stable


On 1/23/17, 8:34 AM, "Bart Van Assche" <bart.vanassche@sandisk.com> wrote:

>qla2x00_probe_one() allocates IRQs before it initializes rsp_q_map
>so IRQs must be freed even if rsp_q_map allocation did not occur.
>This was detected by kmemleak.
>
>Fixes: 4fa183455988 ("scsi: qla2xxx: Utilize pci_alloc_irq_vectors/pci_free_irq_vectors calls")
>Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
>Cc: Michael Hernandez <michael.hernandez@cavium.com>
>Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
>Cc: Christoph Hellwig <hch@lst.de>
>Cc: <stable@vger.kernel.org>
>---
> drivers/scsi/qla2xxx/qla_isr.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
>index dc88a09f9043..a94b0b6bd030 100644
>--- a/drivers/scsi/qla2xxx/qla_isr.c
>+++ b/drivers/scsi/qla2xxx/qla_isr.c
>@@ -3242,7 +3242,7 @@ qla2x00_free_irqs(scsi_qla_host_t *vha)
> 	 * from a probe failure context.
> 	 */
> 	if (!ha->rsp_q_map || !ha->rsp_q_map[0])
>-		return;
>+		goto free_irqs;
> 	rsp = ha->rsp_q_map[0];
> 
> 	if (ha->flags.msix_enabled) {
>@@ -3262,6 +3262,7 @@ qla2x00_free_irqs(scsi_qla_host_t *vha)
> 		free_irq(pci_irq_vector(ha->pdev, 0), rsp);
> 	}
> 
>+free_irqs:
> 	pci_free_irq_vectors(ha->pdev);
> }
> 
>-- 
>2.11.0

Thanks Bart. Looks good. 

Acked-By: Himanshu Madhani <himanshu.madhani@cavium.com>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] qla2xxx: Avoid that issuing a LIP triggers a kernel crash
  2017-01-23 16:34   ` Bart Van Assche
  (?)
@ 2017-01-23 17:41   ` Madhani, Himanshu
  -1 siblings, 0 replies; 25+ messages in thread
From: Madhani, Himanshu @ 2017-01-23 17:41 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: linux-scsi, Naresh Bannoth, Mauricio Faria de Oliveira, stable


On 1/23/17, 8:34 AM, "Bart Van Assche" <bart.vanassche@sandisk.com> wrote:

>Avoid that issuing a LIP as follows:
>
>  find /sys -name 'issue_lip'|while read f; do echo 1 > $f; done
>
>triggers the following:
>
>BUG: unable to handle kernel NULL pointer dereference at (null)
>Call Trace:
> qla2x00_abort_all_cmds+0xed/0x140 [qla2xxx]
> qla2x00_abort_isp_cleanup+0x1e3/0x280 [qla2xxx]
> qla2x00_abort_isp+0xef/0x690 [qla2xxx]
> qla2x00_do_dpc+0x36c/0x880 [qla2xxx]
> kthread+0x10c/0x140
>
>Fixes: 1535aa75a3d8 ("qla2xxx: fix invalid DMA access after command aborts in PCI device remove")
>Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
>Cc: Naresh Bannoth <nbannoth@in.ibm.com>
>Cc: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
>Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
>Cc: <stable@vger.kernel.org>
>---
> drivers/scsi/qla2xxx/qla_os.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
>index 0a000ecf0881..ae9c5a7b239a 100644
>--- a/drivers/scsi/qla2xxx/qla_os.c
>+++ b/drivers/scsi/qla2xxx/qla_os.c
>@@ -1600,6 +1600,7 @@ qla2x00_abort_all_cmds(scsi_qla_host_t *vha, int res)
> 	srb_t *sp;
> 	struct qla_hw_data *ha = vha->hw;
> 	struct req_que *req;
>+	struct scsi_cmnd *scmd;
> 
> 	qlt_host_reset_handler(ha);
> 
>@@ -1613,6 +1614,8 @@ qla2x00_abort_all_cmds(scsi_qla_host_t *vha, int res)
> 		for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) {
> 			sp = req->outstanding_cmds[cnt];
> 			if (sp) {
>+				scmd = GET_CMD_SP(sp);
>+
> 				/* Don't abort commands in adapter during EEH
> 				 * recovery as it's not accessible/responding.
> 				 */
>@@ -1624,7 +1627,8 @@ qla2x00_abort_all_cmds(scsi_qla_host_t *vha, int res)
> 					 */
> 					sp_get(sp);
> 					spin_unlock_irqrestore(&ha->hardware_lock, flags);
>-					qla2xxx_eh_abort(GET_CMD_SP(sp));
>+					if (scmd)
>+						qla2xxx_eh_abort(scmd);
> 					spin_lock_irqsave(&ha->hardware_lock, flags);
> 				}
> 				req->outstanding_cmds[cnt] = NULL;
>-- 
>2.11.0
>

Looks Good. 

Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] qla2xxx: Fix a recently introduced memory leak
  2017-01-23 16:34   ` Bart Van Assche
@ 2017-01-24 12:10     ` Johannes Thumshirn
  -1 siblings, 0 replies; 25+ messages in thread
From: Johannes Thumshirn @ 2017-01-24 12:10 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, linux-scsi, Michael Hernandez,
	Himanshu Madhani, Christoph Hellwig, stable

On Mon, Jan 23, 2017 at 08:34:45AM -0800, Bart Van Assche wrote:
> qla2x00_probe_one() allocates IRQs before it initializes rsp_q_map
> so IRQs must be freed even if rsp_q_map allocation did not occur.
> This was detected by kmemleak.
> 
> Fixes: 4fa183455988 ("scsi: qla2xxx: Utilize pci_alloc_irq_vectors/pci_free_irq_vectors calls")
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Michael Hernandez <michael.hernandez@cavium.com>
> Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: <stable@vger.kernel.org>
> ---

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] qla2xxx: Fix a recently introduced memory leak
@ 2017-01-24 12:10     ` Johannes Thumshirn
  0 siblings, 0 replies; 25+ messages in thread
From: Johannes Thumshirn @ 2017-01-24 12:10 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, linux-scsi, Michael Hernandez,
	Himanshu Madhani, Christoph Hellwig, stable

On Mon, Jan 23, 2017 at 08:34:45AM -0800, Bart Van Assche wrote:
> qla2x00_probe_one() allocates IRQs before it initializes rsp_q_map
> so IRQs must be freed even if rsp_q_map allocation did not occur.
> This was detected by kmemleak.
> 
> Fixes: 4fa183455988 ("scsi: qla2xxx: Utilize pci_alloc_irq_vectors/pci_free_irq_vectors calls")
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Michael Hernandez <michael.hernandez@cavium.com>
> Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: <stable@vger.kernel.org>
> ---

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] qla2xxx: Avoid that issuing a LIP triggers a kernel crash
  2017-01-23 16:34   ` Bart Van Assche
@ 2017-01-24 12:12     ` Johannes Thumshirn
  -1 siblings, 0 replies; 25+ messages in thread
From: Johannes Thumshirn @ 2017-01-24 12:12 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, linux-scsi, Naresh Bannoth,
	Mauricio Faria de Oliveira, Himanshu Madhani, stable

On Mon, Jan 23, 2017 at 08:34:46AM -0800, Bart Van Assche wrote:
> Avoid that issuing a LIP as follows:
> 
>   find /sys -name 'issue_lip'|while read f; do echo 1 > $f; done
> 
> triggers the following:
> 
> BUG: unable to handle kernel NULL pointer dereference at (null)
> Call Trace:
>  qla2x00_abort_all_cmds+0xed/0x140 [qla2xxx]
>  qla2x00_abort_isp_cleanup+0x1e3/0x280 [qla2xxx]
>  qla2x00_abort_isp+0xef/0x690 [qla2xxx]
>  qla2x00_do_dpc+0x36c/0x880 [qla2xxx]
>  kthread+0x10c/0x140
> 
> Fixes: 1535aa75a3d8 ("qla2xxx: fix invalid DMA access after command aborts in PCI device remove")
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Naresh Bannoth <nbannoth@in.ibm.com>
> Cc: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
> Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
> Cc: <stable@vger.kernel.org>
> ---

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] qla2xxx: Avoid that issuing a LIP triggers a kernel crash
@ 2017-01-24 12:12     ` Johannes Thumshirn
  0 siblings, 0 replies; 25+ messages in thread
From: Johannes Thumshirn @ 2017-01-24 12:12 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, linux-scsi, Naresh Bannoth,
	Mauricio Faria de Oliveira, Himanshu Madhani, stable

On Mon, Jan 23, 2017 at 08:34:46AM -0800, Bart Van Assche wrote:
> Avoid that issuing a LIP as follows:
> 
>   find /sys -name 'issue_lip'|while read f; do echo 1 > $f; done
> 
> triggers the following:
> 
> BUG: unable to handle kernel NULL pointer dereference at (null)
> Call Trace:
>  qla2x00_abort_all_cmds+0xed/0x140 [qla2xxx]
>  qla2x00_abort_isp_cleanup+0x1e3/0x280 [qla2xxx]
>  qla2x00_abort_isp+0xef/0x690 [qla2xxx]
>  qla2x00_do_dpc+0x36c/0x880 [qla2xxx]
>  kthread+0x10c/0x140
> 
> Fixes: 1535aa75a3d8 ("qla2xxx: fix invalid DMA access after command aborts in PCI device remove")
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Naresh Bannoth <nbannoth@in.ibm.com>
> Cc: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
> Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
> Cc: <stable@vger.kernel.org>
> ---

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] qla2xxx: Avoid that issuing a LIP triggers a kernel crash
  2017-01-23 16:34   ` Bart Van Assche
                     ` (2 preceding siblings ...)
  (?)
@ 2017-01-24 14:59   ` Mauricio Faria de Oliveira
  2017-01-25 22:05     ` Madhani, Himanshu
  2017-01-25 23:29     ` Martin K. Petersen
  -1 siblings, 2 replies; 25+ messages in thread
From: Mauricio Faria de Oliveira @ 2017-01-24 14:59 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: linux-scsi, Naresh Bannoth, Himanshu Madhani, stable

Hi Bart,

First of all, sorry for the new bug; I didn't realize the pointer could
be NULL at this scenario.

On 01/23/2017 02:34 PM, Bart Van Assche wrote:
> @@ -1624,7 +1627,8 @@ qla2x00_abort_all_cmds(scsi_qla_host_t *vha, int res)
>  					 */
>  					sp_get(sp);
>  					spin_unlock_irqrestore(&ha->hardware_lock, flags);
> -					qla2xxx_eh_abort(GET_CMD_SP(sp));
> +					if (scmd)
> +						qla2xxx_eh_abort(scmd);
>  					spin_lock_irqsave(&ha->hardware_lock, flags);
>  				}

Now, this chunk has a problem with reference counting (and unnecessary
spin-locking), which we can avoid by simply moving up this NULL check.

The call to sp_get() increments the sp->ref_count, but if you skip the
call to qla2xxx_eh_abort() you don't get the decrement from the call to
sp->done() at abort handling from ISR, e.g., qla24xx_abort_iocb_entry().
[or if the command completed successfully between issue/complete abort,
at the completion from ISR, e.g., qla2x00_process_completed_request().]

The sp->done() call just below this chunk was supposed to drop the
initial reference [set at qla2xxx_queuecommand()] at a time we did
not call qla2xxx_eh_abort() yet... but now that we __may__ call it
(and get that sp->done() call from the ISR abort handling), we need
to only increment it if we're going to drop it.

That should be resolved with this slight change to your patch
(which also helps w/ the spin-locking).  What do you/others think?

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 0a000ecf0881..a17cb63b3fd5 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1600,6 +1600,7 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data *ha)
         srb_t *sp;
         struct qla_hw_data *ha = vha->hw;
         struct req_que *req;
+       struct scsi_cmnd *scmd;

         qlt_host_reset_handler(ha);

@@ -1613,10 +1614,12 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data 
*ha)
                 for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) {
                         sp = req->outstanding_cmds[cnt];
                         if (sp) {
+                               scmd = GET_CMD_SP(sp);
+
                                 /* Don't abort commands in adapter 
during EEH
                                  * recovery as it's not 
accessible/responding.
                                  */
-                               if (!ha->flags.eeh_busy) {
+                               if (scmd && !ha->flags.eeh_busy) {
                                         /* Get a reference to the sp 
and drop the lock.
                                          * The reference ensures this 
sp->done() call
                                          * - and not the call in 
qla2xxx_eh_abort() -
@@ -1624,7 +1627,7 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data *ha)
                                          */
                                         sp_get(sp);
 
spin_unlock_irqrestore(&ha->hardware_lock, flags);
-                                       qla2xxx_eh_abort(GET_CMD_SP(sp));
+                                       qla2xxx_eh_abort(scmd);
 
spin_lock_irqsave(&ha->hardware_lock, flags);
                                 }
                                 req->outstanding_cmds[cnt] = NULL;


Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>


-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] qla2xxx: Fix a recently introduced memory leak
  2017-01-23 16:34   ` Bart Van Assche
                     ` (3 preceding siblings ...)
  (?)
@ 2017-01-25 15:47   ` Bart Van Assche
  2017-01-26 14:36     ` hch
  -1 siblings, 1 reply; 25+ messages in thread
From: Bart Van Assche @ 2017-01-25 15:47 UTC (permalink / raw)
  To: hch; +Cc: linux-scsi

On Mon, 2017-01-23 at 08:34 -0800, Bart Van Assche wrote:
> qla2x00_probe_one() allocates IRQs before it initializes rsp_q_map
> so IRQs must be freed even if rsp_q_map allocation did not occur.
> This was detected by kmemleak.
> 
> Fixes: 4fa183455988 ("scsi: qla2xxx: Utilize pci_alloc_irq_vectors/pci_free_irq_vectors calls")
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Michael Hernandez <michael.hernandez@cavium.com>
> Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: <stable@vger.kernel.org>
> ---
>  drivers/scsi/qla2xxx/qla_isr.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
> index dc88a09f9043..a94b0b6bd030 100644
> --- a/drivers/scsi/qla2xxx/qla_isr.c
> +++ b/drivers/scsi/qla2xxx/qla_isr.c
> @@ -3242,7 +3242,7 @@ qla2x00_free_irqs(scsi_qla_host_t *vha)
>  	 * from a probe failure context.
>  	 */
>  	if (!ha->rsp_q_map || !ha->rsp_q_map[0])
> -		return;
> +		goto free_irqs;
>  	rsp = ha->rsp_q_map[0];
>  
>  	if (ha->flags.msix_enabled) {
> @@ -3262,6 +3262,7 @@ qla2x00_free_irqs(scsi_qla_host_t *vha)
>  		free_irq(pci_irq_vector(ha->pdev, 0), rsp);
>  	}
>  
> +free_irqs:
>  	pci_free_irq_vectors(ha->pdev);
>  }

Hello Christoph,

When I tested this patch for the first time on my test setup all my tests
passed. However, when I retested this patch yesterday the SLUB debug code
triggered a complaint. I don't see how my patch could have caused this
complaint. Can you or someone else who is familiar with the
pci_alloc_irq_vectors_affinity() / pci_free_irq_vectors() have a look at
the output below? That output is triggered every time the qla2xxx kernel
is loaded in a virtual machine to which two QLogic FC adapters had been
assigned via PCIe passthrough.

Thanks,

Bart.

qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 8.07.00.38-k.
qla2xxx [0000:00:09.0]-001d: : Found an ISP2432 irq 10 iobase 0xffffc9000009d000.
=============================================================================
BUG kmalloc-16 (Not tainted): Redzone overwritten
-----------------------------------------------------------------------------

Disabling lock debugging due to kernel taint
INFO: 0xffff880030bacc78-0xffff880030bacc7f. First byte 0xf instead of 0xcc
INFO: Allocated in irq_create_affinity_masks+0x5f/0x260 age=0 cpu=3 pid=812
	___slab_alloc.constprop.79+0x482/0x4f0
	__slab_alloc.isra.75.constprop.78+0x55/0xa0
	__kmalloc+0x27c/0x310
	irq_create_affinity_masks+0x5f/0x260
	__pci_enable_msix+0x314/0x4c0
	pci_alloc_irq_vectors_affinity+0xb7/0x140
	qla2x00_request_irqs+0xa6/0x6d0 [qla2xxx]
	qla2x00_probe_one+0xc2e/0x25f0 [qla2xxx]
	pci_device_probe+0x8a/0xf0
	driver_probe_device+0x1f5/0x450
	__driver_attach+0xe3/0xf0
	bus_for_each_dev+0x66/0xa0
	driver_attach+0x1e/0x20
	bus_add_driver+0x200/0x270
	driver_register+0x60/0xe0
	__pci_register_driver+0x5d/0x60
INFO: Freed in acpi_ns_get_node_unlocked+0x90/0xa4 age=0 cpu=3 pid=812
	__slab_free+0x176/0x310
	kfree+0x25e/0x2d0
	acpi_ns_get_node_unlocked+0x90/0xa4
	acpi_ns_get_node+0x3d/0x52
	acpi_get_handle+0x82/0x96
	acpi_pci_irq_find_prt_entry+0x26e/0x2ae
	acpi_pci_irq_lookup+0x28/0x135
	acpi_pci_irq_enable+0x60/0x1f8
	pcibios_enable_device+0x2d/0x30
	do_pci_enable_device+0x64/0xf0
	pci_enable_device_flags+0xc5/0x110
	pci_enable_device_mem+0x13/0x20
	qla2x00_probe_one+0x14b/0x25f0 [qla2xxx]
	pci_device_probe+0x8a/0xf0
	driver_probe_device+0x1f5/0x450
	__driver_attach+0xe3/0xf0
INFO: Slab 0xffffea0000c2eb00 objects=23 used=21 fp=0xffff880030bacdc8 flags=0x4000000000008101
INFO: Object 0xffff880030bacc68 @offset=3176 fp=0xffff880030bacf28

Redzone ffff880030bacc60: cc cc cc cc cc cc cc cc                          ........
Object ffff880030bacc68: ff 00 00 00 00 00 00 00 ff 00 00 00 00 00 00 00  ................
Redzone ffff880030bacc78: 0f 00 00 00 00 00 00 00                          ........
Padding ffff880030bacdb8: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
CPU: 3 PID: 812 Comm: modprobe Tainted: G    B           4.10.0-rc5-dbg+ #9
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
 dump_stack+0x85/0xc2
 print_trailer+0x162/0x260
 check_bytes_and_report+0xc5/0x110
 check_object+0x1da/0x2a0
 free_debug_processing+0x161/0x3d0
 ? debug_lockdep_rcu_enabled+0x1d/0x20
 ? __pci_enable_msix+0x41c/0x4c0
 __slab_free+0x176/0x310
 ? __pci_enable_msix+0x41c/0x4c0
 ? call_rcu+0x17/0x20
 ? kfree+0xe7/0x2d0
 ? __pci_enable_msix+0x41c/0x4c0
 ? __pci_enable_msix+0x41c/0x4c0
 kfree+0x25e/0x2d0
 __pci_enable_msix+0x41c/0x4c0
 pci_alloc_irq_vectors_affinity+0xb7/0x140
 qla2x00_request_irqs+0xa6/0x6d0 [qla2xxx]
 qla2x00_probe_one+0xc2e/0x25f0 [qla2xxx]
 ? __pm_runtime_resume+0x40/0x80
 ? trace_hardirqs_on_caller+0x128/0x1b0
 ? trace_hardirqs_on+0xd/0x10
 ? _raw_spin_unlock_irqrestore+0x4a/0x80
 pci_device_probe+0x8a/0xf0
 driver_probe_device+0x1f5/0x450
 __driver_attach+0xe3/0xf0
 ? driver_probe_device+0x450/0x450
 bus_for_each_dev+0x66/0xa0
 driver_attach+0x1e/0x20
 bus_add_driver+0x200/0x270
 ? 0xffffffffa04eb000
 driver_register+0x60/0xe0
 ? 0xffffffffa04eb000
 __pci_register_driver+0x5d/0x60
 qla2x00_module_init+0x1c9/0x217 [qla2xxx]
 do_one_initcall+0x44/0x180
 ? rcu_read_lock_sched_held+0x72/0x80
 ? kmem_cache_alloc_trace+0x25b/0x2c0
 ? do_init_module+0x27/0x1f9
 do_init_module+0x5f/0x1f9
 load_module+0x2582/0x2a00
 ? __symbol_put+0x70/0x70
 ? kernel_read_file+0x10a/0x1a0
 ? kernel_read_file_from_fd+0x49/0x80
 SYSC_finit_module+0xbc/0xf0
 SyS_finit_module+0xe/0x10
 entry_SYSCALL_64_fastpath+0x23/0xc6
RIP: 0033:0x7f05711388e9
RSP: 002b:00007fff51d4a0f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f05711388e9
RDX: 0000000000000000 RSI: 000055c17ab4f720 RDI: 0000000000000004
RBP: 00007fff51d49100 R08: 0000000000000000 R09: 0000000000000019
R10: 0000000000000004 R11: 0000000000000246 R12: 000055c17ab4f570
R13: 00007fff51d490e0 R14: 0000000000000005 R15: 0000000000040000
FIX kmalloc-16: Restoring 0xffff880030bacc78-0xffff880030bacc7f=0xcc

FIX kmalloc-16: Object at 0xffff880030bacc68 not freed
scsi host2: qla2xxx
qla2xxx [0000:00:09.0]-00fb:2: QLogic QLE2460 - QLogic 4GB FC Single-Port PCI-E HBA for IBM System x.
qla2xxx [0000:00:09.0]-00fc:2: ISP2432: PCIe (2.5GT/s x4) @ 0000:00:09.0 hdma- host#=2 fw=8.03.00 (9496).
qla2xxx [0000:00:0a.0]-001d: : Found an ISP2432 irq 10 iobase 0xffffc900000ad000.
=============================================================================
BUG kmalloc-16 (Tainted: G    B          ): Redzone overwritten
-----------------------------------------------------------------------------

INFO: 0xffff88006ff18dd8-0xffff88006ff18ddf. First byte 0xf instead of 0xcc
INFO: Allocated in irq_create_affinity_masks+0x5f/0x260 age=0 cpu=2 pid=812
	___slab_alloc.constprop.79+0x482/0x4f0
	__slab_alloc.isra.75.constprop.78+0x55/0xa0
	__kmalloc+0x27c/0x310
	irq_create_affinity_masks+0x5f/0x260
	__pci_enable_msix+0x314/0x4c0
	pci_alloc_irq_vectors_affinity+0xb7/0x140
	qla2x00_request_irqs+0xa6/0x6d0 [qla2xxx]
	qla2x00_probe_one+0xc2e/0x25f0 [qla2xxx]
	pci_device_probe+0x8a/0xf0
	driver_probe_device+0x1f5/0x450
	__driver_attach+0xe3/0xf0
	bus_for_each_dev+0x66/0xa0
	driver_attach+0x1e/0x20
	bus_add_driver+0x200/0x270
	driver_register+0x60/0xe0
	__pci_register_driver+0x5d/0x60
INFO: Freed in acpi_ns_get_node_unlocked+0x90/0xa4 age=1 cpu=2 pid=812
	__slab_free+0x176/0x310
	kfree+0x25e/0x2d0
	acpi_ns_get_node_unlocked+0x90/0xa4
	acpi_ns_get_node+0x3d/0x52
	acpi_get_handle+0x82/0x96
	acpi_pci_irq_find_prt_entry+0x26e/0x2ae
	acpi_pci_irq_lookup+0x28/0x135
	acpi_pci_irq_enable+0x60/0x1f8
	pcibios_enable_device+0x2d/0x30
	do_pci_enable_device+0x64/0xf0
	pci_enable_device_flags+0xc5/0x110
	pci_enable_device_mem+0x13/0x20
	qla2x00_probe_one+0x14b/0x25f0 [qla2xxx]
	pci_device_probe+0x8a/0xf0
	driver_probe_device+0x1f5/0x450
	__driver_attach+0xe3/0xf0
INFO: Slab 0xffffea0001bfc600 objects=23 used=22 fp=0xffff88006ff18f28 flags=0x4000000000008101
INFO: Object 0xffff88006ff18dc8 @offset=3528 fp=0xffff88006ff18f28

Redzone ffff88006ff18dc0: cc cc cc cc cc cc cc cc                          ........
Object ffff88006ff18dc8: ff 00 00 00 00 00 00 00 ff 00 00 00 00 00 00 00  ................
Redzone ffff88006ff18dd8: 0f 00 00 00 00 00 00 00                          ........
Padding ffff88006ff18f18: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
CPU: 2 PID: 812 Comm: modprobe Tainted: G    B           4.10.0-rc5-dbg+ #9
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
 dump_stack+0x85/0xc2
 print_trailer+0x162/0x260
 check_bytes_and_report+0xc5/0x110
 check_object+0x1da/0x2a0
 free_debug_processing+0x161/0x3d0
 ? __pci_enable_msix+0x41c/0x4c0
 __slab_free+0x176/0x310
 ? __pci_enable_msix+0x41c/0x4c0
 ? call_rcu+0x17/0x20
 ? put_object+0x2d/0x50
 ? __delete_object+0x3d/0x70
 ? __pci_enable_msix+0x41c/0x4c0
 kfree+0x25e/0x2d0
 __pci_enable_msix+0x41c/0x4c0
 pci_alloc_irq_vectors_affinity+0xb7/0x140
 qla2x00_request_irqs+0xa6/0x6d0 [qla2xxx]
 qla2x00_probe_one+0xc2e/0x25f0 [qla2xxx]
 ? __pm_runtime_resume+0x40/0x80
 ? trace_hardirqs_on+0xd/0x10
 ? _raw_spin_unlock_irqrestore+0x4a/0x80
 pci_device_probe+0x8a/0xf0
 driver_probe_device+0x1f5/0x450
 __driver_attach+0xe3/0xf0
 ? driver_probe_device+0x450/0x450
 bus_for_each_dev+0x66/0xa0
 driver_attach+0x1e/0x20
 bus_add_driver+0x200/0x270
 ? 0xffffffffa04eb000
 driver_register+0x60/0xe0
 ? 0xffffffffa04eb000
 __pci_register_driver+0x5d/0x60
 qla2x00_module_init+0x1c9/0x217 [qla2xxx]
 do_one_initcall+0x44/0x180
 ? rcu_read_lock_sched_held+0x72/0x80
 ? kmem_cache_alloc_trace+0x25b/0x2c0
 ? do_init_module+0x27/0x1f9
 do_init_module+0x5f/0x1f9
 load_module+0x2582/0x2a00
 ? __symbol_put+0x70/0x70
 ? kernel_read_file+0x10a/0x1a0
 ? kernel_read_file_from_fd+0x49/0x80
 SYSC_finit_module+0xbc/0xf0
 SyS_finit_module+0xe/0x10
 entry_SYSCALL_64_fastpath+0x23/0xc6
RIP: 0033:0x7f05711388e9
RSP: 002b:00007fff51d4a0f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f05711388e9
RDX: 0000000000000000 RSI: 000055c17ab4f720 RDI: 0000000000000004
RBP: 00007fff51d49100 R08: 0000000000000000 R09: 0000000000000019
R10: 0000000000000004 R11: 0000000000000246 R12: 000055c17ab4f570
R13: 00007fff51d490e0 R14: 0000000000000005 R15: 0000000000040000
FIX kmalloc-16: Restoring 0xffff88006ff18dd8-0xffff88006ff18ddf=0xcc

FIX kmalloc-16: Object at 0xffff88006ff18dc8 not freed
scsi host3: qla2xxx
qla2xxx [0000:00:09.0]-500a:2: LOOP UP detected (4 Gbps).
qla2xxx [0000:00:0a.0]-500a:3: LOOP UP detected (4 Gbps).


(gdb) list *(__pci_enable_msix+0x314)
0xffffffff8131aa74 is in __pci_enable_msix (drivers/pci/msi.c:702).
697             struct msi_desc *entry;
698             int ret, i;
699
700             if (affd) {
701                     masks = irq_create_affinity_masks(nvec, affd);
702                     if (!masks)
703                             pr_err("Unable to allocate affinity masks, ignoring\n");
704             }
705
706             for (i = 0, curmsk = masks; i < nvec; i++) {
(gdb) list *(__pci_enable_msix+0x41c)
0xffffffff8131ab7c is in __pci_enable_msix (drivers/pci/msi.c:783).
778
779             ret = msix_setup_entries(dev, base, entries, nvec, affd);
780             if (ret)
781                     return ret;
782
783             ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
784             if (ret)
785                     goto out_avail;
786
787             /* Check if all MSI entries honor device restrictions */
(gdb) quit

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] qla2xxx: Avoid that issuing a LIP triggers a kernel crash
  2017-01-24 14:59   ` Mauricio Faria de Oliveira
@ 2017-01-25 22:05     ` Madhani, Himanshu
  2017-01-25 23:29     ` Martin K. Petersen
  1 sibling, 0 replies; 25+ messages in thread
From: Madhani, Himanshu @ 2017-01-25 22:05 UTC (permalink / raw)
  To: Mauricio Faria de Oliveira, Bart Van Assche, Martin K . Petersen
  Cc: linux-scsi, Naresh Bannoth, stable



On 1/24/17, 6:59 AM, "Mauricio Faria de Oliveira" <mauricfo@linux.vnet.ibm.com> wrote:

>Hi Bart,
>
>First of all, sorry for the new bug; I didn't realize the pointer could
>be NULL at this scenario.
>
>On 01/23/2017 02:34 PM, Bart Van Assche wrote:
>> @@ -1624,7 +1627,8 @@ qla2x00_abort_all_cmds(scsi_qla_host_t *vha, int res)
>>  					 */
>>  					sp_get(sp);
>>  					spin_unlock_irqrestore(&ha->hardware_lock, flags);
>> -					qla2xxx_eh_abort(GET_CMD_SP(sp));
>> +					if (scmd)
>> +						qla2xxx_eh_abort(scmd);
>>  					spin_lock_irqsave(&ha->hardware_lock, flags);
>>  				}
>
>Now, this chunk has a problem with reference counting (and unnecessary
>spin-locking), which we can avoid by simply moving up this NULL check.
>
>The call to sp_get() increments the sp->ref_count, but if you skip the
>call to qla2xxx_eh_abort() you don't get the decrement from the call to
>sp->done() at abort handling from ISR, e.g., qla24xx_abort_iocb_entry().
>[or if the command completed successfully between issue/complete abort,
>at the completion from ISR, e.g., qla2x00_process_completed_request().]
>
>The sp->done() call just below this chunk was supposed to drop the
>initial reference [set at qla2xxx_queuecommand()] at a time we did
>not call qla2xxx_eh_abort() yet... but now that we __may__ call it
>(and get that sp->done() call from the ISR abort handling), we need
>to only increment it if we're going to drop it.
>
>That should be resolved with this slight change to your patch
>(which also helps w/ the spin-locking).  What do you/others think?
>
>diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
>index 0a000ecf0881..a17cb63b3fd5 100644
>--- a/drivers/scsi/qla2xxx/qla_os.c
>+++ b/drivers/scsi/qla2xxx/qla_os.c
>@@ -1600,6 +1600,7 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data *ha)
>         srb_t *sp;
>         struct qla_hw_data *ha = vha->hw;
>         struct req_que *req;
>+       struct scsi_cmnd *scmd;
>
>         qlt_host_reset_handler(ha);
>
>@@ -1613,10 +1614,12 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data 
>*ha)
>                 for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) {
>                         sp = req->outstanding_cmds[cnt];
>                         if (sp) {
>+                               scmd = GET_CMD_SP(sp);
>+
>                                 /* Don't abort commands in adapter 
>during EEH
>                                  * recovery as it's not 
>accessible/responding.
>                                  */
>-                               if (!ha->flags.eeh_busy) {
>+                               if (scmd && !ha->flags.eeh_busy) {
>                                         /* Get a reference to the sp 
>and drop the lock.
>                                          * The reference ensures this 
>sp->done() call
>                                          * - and not the call in 
>qla2xxx_eh_abort() -
>@@ -1624,7 +1627,7 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data *ha)
>                                          */
>                                         sp_get(sp);
> 
>spin_unlock_irqrestore(&ha->hardware_lock, flags);
>-                                       qla2xxx_eh_abort(GET_CMD_SP(sp));
>+                                       qla2xxx_eh_abort(scmd);
> 
>spin_lock_irqsave(&ha->hardware_lock, flags);
>                                 }
>                                 req->outstanding_cmds[cnt] = NULL;
>
>
>Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
>
>
>-- 
>Mauricio Faria de Oliveira
>IBM Linux Technology Center

This is more appropriate fix. Looks good.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] qla2xxx: Fix a recently introduced memory leak
  2017-01-23 16:34   ` Bart Van Assche
@ 2017-01-25 23:28     ` Martin K. Petersen
  -1 siblings, 0 replies; 25+ messages in thread
From: Martin K. Petersen @ 2017-01-25 23:28 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, linux-scsi, Michael Hernandez,
	Himanshu Madhani, Christoph Hellwig, stable

>>>>> "Bart" == Bart Van Assche <bart.vanassche@sandisk.com> writes:

Bart> qla2x00_probe_one() allocates IRQs before it initializes rsp_q_map
Bart> so IRQs must be freed even if rsp_q_map allocation did not occur.
Bart> This was detected by kmemleak.

I queued this one yesterday but was waiting for a resolution on patch
2...

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] qla2xxx: Fix a recently introduced memory leak
@ 2017-01-25 23:28     ` Martin K. Petersen
  0 siblings, 0 replies; 25+ messages in thread
From: Martin K. Petersen @ 2017-01-25 23:28 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, linux-scsi, Michael Hernandez,
	Himanshu Madhani, Christoph Hellwig, stable

>>>>> "Bart" == Bart Van Assche <bart.vanassche@sandisk.com> writes:

Bart> qla2x00_probe_one() allocates IRQs before it initializes rsp_q_map
Bart> so IRQs must be freed even if rsp_q_map allocation did not occur.
Bart> This was detected by kmemleak.

I queued this one yesterday but was waiting for a resolution on patch
2...

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] qla2xxx: Avoid that issuing a LIP triggers a kernel crash
  2017-01-24 14:59   ` Mauricio Faria de Oliveira
  2017-01-25 22:05     ` Madhani, Himanshu
@ 2017-01-25 23:29     ` Martin K. Petersen
  2017-01-26  0:09       ` Mauricio Faria de Oliveira
  1 sibling, 1 reply; 25+ messages in thread
From: Martin K. Petersen @ 2017-01-25 23:29 UTC (permalink / raw)
  To: Mauricio Faria de Oliveira
  Cc: Bart Van Assche, Martin K . Petersen, linux-scsi, Naresh Bannoth,
	Himanshu Madhani, stable

>>>>> "Mauricio" == Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> writes:

Hi Mauricio,

Mauricio> First of all, sorry for the new bug; I didn't realize the
Mauricio> pointer could be NULL at this scenario.

Please do a proper patch submission for this fix.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] qla2xxx: Avoid that issuing a LIP triggers a kernel crash
  2017-01-25 23:29     ` Martin K. Petersen
@ 2017-01-26  0:09       ` Mauricio Faria de Oliveira
  0 siblings, 0 replies; 25+ messages in thread
From: Mauricio Faria de Oliveira @ 2017-01-26  0:09 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Bart Van Assche, linux-scsi, Naresh Bannoth, Himanshu Madhani, stable

Hi Martin,

On 01/25/2017 09:29 PM, Martin K. Petersen wrote:
> Please do a proper patch submission for this fix.

Okay, I submitted a v2 patch w/ the suggested change.

However, the original patch has been submitted by Bart,
so I believe credit is due, but not sure how to handle
this case.

Thus, please feel free to change the sign-off line as
appropriate here.

Thanks,

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] qla2xxx: Fix a recently introduced memory leak
  2017-01-25 15:47   ` Bart Van Assche
@ 2017-01-26 14:36     ` hch
  2017-01-29  5:17       ` Bart Van Assche
  0 siblings, 1 reply; 25+ messages in thread
From: hch @ 2017-01-26 14:36 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: linux-scsi, linux-acpi, linux-kernel

On Wed, Jan 25, 2017 at 03:47:20PM +0000, Bart Van Assche wrote:
> =============================================================================
> BUG kmalloc-16 (Not tainted): Redzone overwritten
> -----------------------------------------------------------------------------
> 
> Disabling lock debugging due to kernel taint
> INFO: 0xffff880030bacc78-0xffff880030bacc7f. First byte 0xf instead of 0xcc
> INFO: Allocated in irq_create_affinity_masks+0x5f/0x260 age=0 cpu=3 pid=812
> 	___slab_alloc.constprop.79+0x482/0x4f0
> 	__slab_alloc.isra.75.constprop.78+0x55/0xa0
> 	__kmalloc+0x27c/0x310
> 	irq_create_affinity_masks+0x5f/0x260

This is the normal affinity mask allocation.

> 	__pci_enable_msix+0x314/0x4c0
> 	pci_alloc_irq_vectors_affinity+0xb7/0x140
> 	qla2x00_request_irqs+0xa6/0x6d0 [qla2xxx]
> 	qla2x00_probe_one+0xc2e/0x25f0 [qla2xxx]
> 	pci_device_probe+0x8a/0xf0
> 	driver_probe_device+0x1f5/0x450
> 	__driver_attach+0xe3/0xf0
> 	bus_for_each_dev+0x66/0xa0
> 	driver_attach+0x1e/0x20
> 	bus_add_driver+0x200/0x270
> 	driver_register+0x60/0xe0
> 	__pci_register_driver+0x5d/0x60
> INFO: Freed in acpi_ns_get_node_unlocked+0x90/0xa4 age=0 cpu=3 pid=812
> 	__slab_free+0x176/0x310
> 	kfree+0x25e/0x2d0
> 	acpi_ns_get_node_unlocked+0x90/0xa4
> 	acpi_ns_get_node+0x3d/0x52
> 	acpi_get_handle+0x82/0x96

This on the other hand I don't understand acpi_ns_get_node_unlocked
only frees the object it allocated in the ACPI code using
acpi_ns_internalize_name.  I can't really see any relation to the
affinity mask allocation.

> 	acpi_pci_irq_find_prt_entry+0x26e/0x2ae
> 	acpi_pci_irq_lookup+0x28/0x135
> 	acpi_pci_irq_enable+0x60/0x1f8
> 	pcibios_enable_device+0x2d/0x30
> 	do_pci_enable_device+0x64/0xf0
> 	pci_enable_device_flags+0xc5/0x110
> 	pci_enable_device_mem+0x13/0x20
> 	qla2x00_probe_one+0x14b/0x25f0 [qla2xxx]
> 	pci_device_probe+0x8a/0xf0
> 	driver_probe_device+0x1f5/0x450
> 	__driver_attach+0xe3/0xf0
> INFO: Slab 0xffffea0000c2eb00 objects=23 used=21 fp=0xffff880030bacdc8 flags=0x4000000000008101
> INFO: Object 0xffff880030bacc68 @offset=3176 fp=0xffff880030bacf28
> 
> Redzone ffff880030bacc60: cc cc cc cc cc cc cc cc                          ........
> Object ffff880030bacc68: ff 00 00 00 00 00 00 00 ff 00 00 00 00 00 00 00  ................
> Redzone ffff880030bacc78: 0f 00 00 00 00 00 00 00                          ........
> Padding ffff880030bacdb8: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
> CPU: 3 PID: 812 Comm: modprobe Tainted: G    B           4.10.0-rc5-dbg+ #9
> Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> Call Trace:
>  dump_stack+0x85/0xc2
>  print_trailer+0x162/0x260
>  check_bytes_and_report+0xc5/0x110
>  check_object+0x1da/0x2a0
>  free_debug_processing+0x161/0x3d0
>  ? debug_lockdep_rcu_enabled+0x1d/0x20
>  ? __pci_enable_msix+0x41c/0x4c0
>  __slab_free+0x176/0x310
>  ? __pci_enable_msix+0x41c/0x4c0
>  ? call_rcu+0x17/0x20
>  ? kfree+0xe7/0x2d0
>  ? __pci_enable_msix+0x41c/0x4c0
>  ? __pci_enable_msix+0x41c/0x4c0
>  kfree+0x25e/0x2d0
>  __pci_enable_msix+0x41c/0x4c0
>  pci_alloc_irq_vectors_affinity+0xb7/0x140
>  qla2x00_request_irqs+0xa6/0x6d0 [qla2xxx]
>  qla2x00_probe_one+0xc2e/0x25f0 [qla2xxx]
>  ? __pm_runtime_resume+0x40/0x80
>  ? trace_hardirqs_on_caller+0x128/0x1b0
>  ? trace_hardirqs_on+0xd/0x10
>  ? _raw_spin_unlock_irqrestore+0x4a/0x80
>  pci_device_probe+0x8a/0xf0
>  driver_probe_device+0x1f5/0x450
>  __driver_attach+0xe3/0xf0
>  ? driver_probe_device+0x450/0x450
>  bus_for_each_dev+0x66/0xa0
>  driver_attach+0x1e/0x20
>  bus_add_driver+0x200/0x270
>  ? 0xffffffffa04eb000
>  driver_register+0x60/0xe0
>  ? 0xffffffffa04eb000
>  __pci_register_driver+0x5d/0x60
>  qla2x00_module_init+0x1c9/0x217 [qla2xxx]
>  do_one_initcall+0x44/0x180
>  ? rcu_read_lock_sched_held+0x72/0x80
>  ? kmem_cache_alloc_trace+0x25b/0x2c0
>  ? do_init_module+0x27/0x1f9
>  do_init_module+0x5f/0x1f9
>  load_module+0x2582/0x2a00
>  ? __symbol_put+0x70/0x70
>  ? kernel_read_file+0x10a/0x1a0
>  ? kernel_read_file_from_fd+0x49/0x80
>  SYSC_finit_module+0xbc/0xf0
>  SyS_finit_module+0xe/0x10
>  entry_SYSCALL_64_fastpath+0x23/0xc6
> RIP: 0033:0x7f05711388e9
> RSP: 002b:00007fff51d4a0f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f05711388e9
> RDX: 0000000000000000 RSI: 000055c17ab4f720 RDI: 0000000000000004
> RBP: 00007fff51d49100 R08: 0000000000000000 R09: 0000000000000019
> R10: 0000000000000004 R11: 0000000000000246 R12: 000055c17ab4f570
> R13: 00007fff51d490e0 R14: 0000000000000005 R15: 0000000000040000
> FIX kmalloc-16: Restoring 0xffff880030bacc78-0xffff880030bacc7f=0xcc
> 
> FIX kmalloc-16: Object at 0xffff880030bacc68 not freed
> scsi host2: qla2xxx
> qla2xxx [0000:00:09.0]-00fb:2: QLogic QLE2460 - QLogic 4GB FC Single-Port PCI-E HBA for IBM System x.
> qla2xxx [0000:00:09.0]-00fc:2: ISP2432: PCIe (2.5GT/s x4) @ 0000:00:09.0 hdma- host#=2 fw=8.03.00 (9496).
> qla2xxx [0000:00:0a.0]-001d: : Found an ISP2432 irq 10 iobase 0xffffc900000ad000.
> =============================================================================
> BUG kmalloc-16 (Tainted: G    B          ): Redzone overwritten
> -----------------------------------------------------------------------------
> 
> INFO: 0xffff88006ff18dd8-0xffff88006ff18ddf. First byte 0xf instead of 0xcc
> INFO: Allocated in irq_create_affinity_masks+0x5f/0x260 age=0 cpu=2 pid=812
> 	___slab_alloc.constprop.79+0x482/0x4f0
> 	__slab_alloc.isra.75.constprop.78+0x55/0xa0
> 	__kmalloc+0x27c/0x310
> 	irq_create_affinity_masks+0x5f/0x260
> 	__pci_enable_msix+0x314/0x4c0
> 	pci_alloc_irq_vectors_affinity+0xb7/0x140
> 	qla2x00_request_irqs+0xa6/0x6d0 [qla2xxx]
> 	qla2x00_probe_one+0xc2e/0x25f0 [qla2xxx]
> 	pci_device_probe+0x8a/0xf0
> 	driver_probe_device+0x1f5/0x450
> 	__driver_attach+0xe3/0xf0
> 	bus_for_each_dev+0x66/0xa0
> 	driver_attach+0x1e/0x20
> 	bus_add_driver+0x200/0x270
> 	driver_register+0x60/0xe0
> 	__pci_register_driver+0x5d/0x60
> INFO: Freed in acpi_ns_get_node_unlocked+0x90/0xa4 age=1 cpu=2 pid=812
> 	__slab_free+0x176/0x310
> 	kfree+0x25e/0x2d0
> 	acpi_ns_get_node_unlocked+0x90/0xa4
> 	acpi_ns_get_node+0x3d/0x52
> 	acpi_get_handle+0x82/0x96
> 	acpi_pci_irq_find_prt_entry+0x26e/0x2ae
> 	acpi_pci_irq_lookup+0x28/0x135
> 	acpi_pci_irq_enable+0x60/0x1f8
> 	pcibios_enable_device+0x2d/0x30
> 	do_pci_enable_device+0x64/0xf0
> 	pci_enable_device_flags+0xc5/0x110
> 	pci_enable_device_mem+0x13/0x20
> 	qla2x00_probe_one+0x14b/0x25f0 [qla2xxx]
> 	pci_device_probe+0x8a/0xf0
> 	driver_probe_device+0x1f5/0x450
> 	__driver_attach+0xe3/0xf0
> INFO: Slab 0xffffea0001bfc600 objects=23 used=22 fp=0xffff88006ff18f28 flags=0x4000000000008101
> INFO: Object 0xffff88006ff18dc8 @offset=3528 fp=0xffff88006ff18f28
> 
> Redzone ffff88006ff18dc0: cc cc cc cc cc cc cc cc                          ........
> Object ffff88006ff18dc8: ff 00 00 00 00 00 00 00 ff 00 00 00 00 00 00 00  ................
> Redzone ffff88006ff18dd8: 0f 00 00 00 00 00 00 00                          ........
> Padding ffff88006ff18f18: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
> CPU: 2 PID: 812 Comm: modprobe Tainted: G    B           4.10.0-rc5-dbg+ #9
> Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> Call Trace:
>  dump_stack+0x85/0xc2
>  print_trailer+0x162/0x260
>  check_bytes_and_report+0xc5/0x110
>  check_object+0x1da/0x2a0
>  free_debug_processing+0x161/0x3d0
>  ? __pci_enable_msix+0x41c/0x4c0
>  __slab_free+0x176/0x310
>  ? __pci_enable_msix+0x41c/0x4c0
>  ? call_rcu+0x17/0x20
>  ? put_object+0x2d/0x50
>  ? __delete_object+0x3d/0x70
>  ? __pci_enable_msix+0x41c/0x4c0
>  kfree+0x25e/0x2d0
>  __pci_enable_msix+0x41c/0x4c0
>  pci_alloc_irq_vectors_affinity+0xb7/0x140
>  qla2x00_request_irqs+0xa6/0x6d0 [qla2xxx]
>  qla2x00_probe_one+0xc2e/0x25f0 [qla2xxx]
>  ? __pm_runtime_resume+0x40/0x80
>  ? trace_hardirqs_on+0xd/0x10
>  ? _raw_spin_unlock_irqrestore+0x4a/0x80
>  pci_device_probe+0x8a/0xf0
>  driver_probe_device+0x1f5/0x450
>  __driver_attach+0xe3/0xf0
>  ? driver_probe_device+0x450/0x450
>  bus_for_each_dev+0x66/0xa0
>  driver_attach+0x1e/0x20
>  bus_add_driver+0x200/0x270
>  ? 0xffffffffa04eb000
>  driver_register+0x60/0xe0
>  ? 0xffffffffa04eb000
>  __pci_register_driver+0x5d/0x60
>  qla2x00_module_init+0x1c9/0x217 [qla2xxx]
>  do_one_initcall+0x44/0x180
>  ? rcu_read_lock_sched_held+0x72/0x80
>  ? kmem_cache_alloc_trace+0x25b/0x2c0
>  ? do_init_module+0x27/0x1f9
>  do_init_module+0x5f/0x1f9
>  load_module+0x2582/0x2a00
>  ? __symbol_put+0x70/0x70
>  ? kernel_read_file+0x10a/0x1a0
>  ? kernel_read_file_from_fd+0x49/0x80
>  SYSC_finit_module+0xbc/0xf0
>  SyS_finit_module+0xe/0x10
>  entry_SYSCALL_64_fastpath+0x23/0xc6
> RIP: 0033:0x7f05711388e9
> RSP: 002b:00007fff51d4a0f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f05711388e9
> RDX: 0000000000000000 RSI: 000055c17ab4f720 RDI: 0000000000000004
> RBP: 00007fff51d49100 R08: 0000000000000000 R09: 0000000000000019
> R10: 0000000000000004 R11: 0000000000000246 R12: 000055c17ab4f570
> R13: 00007fff51d490e0 R14: 0000000000000005 R15: 0000000000040000
> FIX kmalloc-16: Restoring 0xffff88006ff18dd8-0xffff88006ff18ddf=0xcc
> 
> FIX kmalloc-16: Object at 0xffff88006ff18dc8 not freed
> scsi host3: qla2xxx
> qla2xxx [0000:00:09.0]-500a:2: LOOP UP detected (4 Gbps).
> qla2xxx [0000:00:0a.0]-500a:3: LOOP UP detected (4 Gbps).
> 
> 
> (gdb) list *(__pci_enable_msix+0x314)
> 0xffffffff8131aa74 is in __pci_enable_msix (drivers/pci/msi.c:702).
> 697             struct msi_desc *entry;
> 698             int ret, i;
> 699
> 700             if (affd) {
> 701                     masks = irq_create_affinity_masks(nvec, affd);
> 702                     if (!masks)
> 703                             pr_err("Unable to allocate affinity masks, ignoring\n");
> 704             }
> 705
> 706             for (i = 0, curmsk = masks; i < nvec; i++) {
> (gdb) list *(__pci_enable_msix+0x41c)
> 0xffffffff8131ab7c is in __pci_enable_msix (drivers/pci/msi.c:783).
> 778
> 779             ret = msix_setup_entries(dev, base, entries, nvec, affd);
> 780             if (ret)
> 781                     return ret;
> 782
> 783             ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
> 784             if (ret)
> 785                     goto out_avail;
> 786
> 787             /* Check if all MSI entries honor device restrictions */
> (gdb) quit---end quoted text---

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] qla2xxx: Fix a recently introduced memory leak
  2017-01-26 14:36     ` hch
@ 2017-01-29  5:17       ` Bart Van Assche
  2017-01-29  9:07         ` hch
  0 siblings, 1 reply; 25+ messages in thread
From: Bart Van Assche @ 2017-01-29  5:17 UTC (permalink / raw)
  To: hch; +Cc: linux-scsi

On Thu, 2017-01-26 at 15:36 +0100, hch@lst.de wrote:
> On Wed, Jan 25, 2017 at 03:47:20PM +0000, Bart Van Assche wrote:
> > =============================================================================
> > BUG kmalloc-16 (Not tainted): Redzone overwritten
> > -----------------------------------------------------------------------------
> > 
> > Disabling lock debugging due to kernel taint
> > INFO: 0xffff880030bacc78-0xffff880030bacc7f. First byte 0xf instead of 0xcc
> > INFO: Allocated in irq_create_affinity_masks+0x5f/0x260 age=0 cpu=3 pid=812
> > 	___slab_alloc.constprop.79+0x482/0x4f0
> > 	__slab_alloc.isra.75.constprop.78+0x55/0xa0
> > 	__kmalloc+0x27c/0x310
> > 	irq_create_affinity_masks+0x5f/0x260
> 
> This is the normal affinity mask allocation.

(reduced CC-list again)

Hello Christoph,

It seems like irq_create_affinity_masks() wrote past the bounds of the masks array
it allocated. After I had added the following debug code in irq_create_affinity_masks():

        WARN_ON_ONCE(affv <= 0);
        pr_err("%s: affd = { .pre = %d, .post = %d }, nvecs = %d\n",
                __func__, affd->pre_vectors, affd->post_vectors, nvecs);

The following output appeared:

WARNING: CPU: 0 PID: 814 at kernel/irq/affinity.c:69 irq_create_affinity_masks+0x2cd/0x2f0
Call Trace:
 dump_stack+0x85/0xc2
 __warn+0xcb/0xf0
 warn_slowpath_null+0x1d/0x20
 irq_create_affinity_masks+0x2cd/0x2f0
 __pci_enable_msix+0x314/0x4c0
 pci_alloc_irq_vectors_affinity+0xb7/0x140
 qla2x00_request_irqs+0xa6/0x6d0 [qla2xxx]
 qla2x00_probe_one+0xc2e/0x25f0 [qla2xxx]
 pci_device_probe+0x8a/0xf0
 driver_probe_device+0x1f5/0x450
 __driver_attach+0xe3/0xf0
 bus_for_each_dev+0x66/0xa0
 driver_attach+0x1e/0x20
 bus_add_driver+0x200/0x270
 driver_register+0x60/0xe0
 __pci_register_driver+0x5d/0x60
 qla2x00_module_init+0x1c9/0x217 [qla2xxx]
 do_one_initcall+0x44/0x180
 do_init_module+0x5f/0x1f9
 load_module+0x2582/0x2a00
 SYSC_finit_module+0xbc/0xf0
 SyS_finit_module+0xe/0x10
 entry_SYSCALL_64_fastpath+0x23/0xc6

irq_create_affinity_masks: affd = { .pre = 2, .post = 0 }, nvecs = 2

affd comes from the qla2xxx driver: struct irq_affinity desc = { .pre_vectors =
QLA_BASE_VECTORS }. Shouldn't irq_calc_affinity_vectors() guarantee that it
returns a value that is strictly greater than affd->pre_vectors + affd->post_vectors
instead of greater than or equal to affd->pre_vectors + affd->post_vectors?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] qla2xxx: Fix a recently introduced memory leak
  2017-01-29  5:17       ` Bart Van Assche
@ 2017-01-29  9:07         ` hch
  2017-01-29 17:14           ` Bart Van Assche
  0 siblings, 1 reply; 25+ messages in thread
From: hch @ 2017-01-29  9:07 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: hch, linux-scsi

Hi Bart,

thanks for the analysis.  I think the main issue here is that we shouldn't
try to assign affinity masks if there are not affinity vectors left to
assign.

Does the patch below fix the issue for you?

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 50c5003..96c0713 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1206,6 +1206,11 @@ int pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
 	if (flags & PCI_IRQ_AFFINITY) {
 		if (!affd)
 			affd = &msi_default_affd;
+
+		if (affd->pre_vectors + affd->post_vectors > min_vecs)
+			return -EINVAL;
+		if (affd->pre_vectors + affd->post_vectors == min_vecs)
+			flags &=~ PCI_IRQ_AFFINITY;
 	} else {
 		if (WARN_ON(affd))
 			affd = NULL;

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] qla2xxx: Fix a recently introduced memory leak
  2017-01-29  9:07         ` hch
@ 2017-01-29 17:14           ` Bart Van Assche
  0 siblings, 0 replies; 25+ messages in thread
From: Bart Van Assche @ 2017-01-29 17:14 UTC (permalink / raw)
  To: hch; +Cc: linux-scsi

On Sun, 2017-01-29 at 10:07 +0100, hch@lst.de wrote:
> Hi Bart,
> 
> thanks for the analysis.  I think the main issue here is that we shouldn't
> try to assign affinity masks if there are not affinity vectors left to
> assign.
> 
> Does the patch below fix the issue for you?
> 
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index 50c5003..96c0713 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -1206,6 +1206,11 @@ int pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
>  	if (flags & PCI_IRQ_AFFINITY) {
>  		if (!affd)
>  			affd = &msi_default_affd;
> +
> +		if (affd->pre_vectors + affd->post_vectors > min_vecs)
> +			return -EINVAL;
> +		if (affd->pre_vectors + affd->post_vectors == min_vecs)
> +			flags &=~ PCI_IRQ_AFFINITY;
>  	} else {
>  		if (WARN_ON(affd))
>  			affd = NULL;

Hello Christoph,

Thanks for the quick reply. Sorry but the above patch didn't help - the
PCI_IRQ_AFFINITY isn't tested after the above code clears it. But after I
changed "flags &=~ PCI_IRQ_AFFINITY" into "affd = NULL" the memory
corruption complaint was gone and I/O through the qla2xxx adapter
was still working fine. If you want you can add the following to the above
patch with the aforementioned change:

Tested-by: Bart Van Assche <bart.vanassche@sandisk.com>

Bart.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] qla2xxx: Fix a recently introduced memory leak
  2017-01-25 23:28     ` Martin K. Petersen
  (?)
@ 2017-02-03 16:59     ` Bart Van Assche
  2017-02-07  0:23       ` Martin K. Petersen
  -1 siblings, 1 reply; 25+ messages in thread
From: Bart Van Assche @ 2017-02-03 16:59 UTC (permalink / raw)
  To: martin.petersen; +Cc: linux-scsi, michael.hernandez, himanshu.madhani, hch

On Wed, 2017-01-25 at 18:28 -0500, Martin K. Petersen wrote:
> > > > > > "Bart" == Bart Van Assche <bart.vanassche@sandisk.com> writes:
> 
> Bart> qla2x00_probe_one() allocates IRQs before it initializes rsp_q_map
> Bart> so IRQs must be freed even if rsp_q_map allocation did not occur.
> Bart> This was detected by kmemleak.
> 
> I queued this one yesterday but was waiting for a resolution on patch
> 2...

Hello Martin,

Since there is now a resolution for patch 2: have you already decided when
to submit these patches upstream?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] qla2xxx: Fix a recently introduced memory leak
  2017-02-03 16:59     ` Bart Van Assche
@ 2017-02-07  0:23       ` Martin K. Petersen
  0 siblings, 0 replies; 25+ messages in thread
From: Martin K. Petersen @ 2017-02-07  0:23 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: martin.petersen, linux-scsi, michael.hernandez, himanshu.madhani, hch

>>>>> "Bart" == Bart Van Assche <Bart.VanAssche@sandisk.com> writes:

Hi Bart,

Bart> Since there is now a resolution for patch 2: have you already
Bart> decided when to submit these patches upstream?

I queued it in scsi-fixes last week:

commit 2780f3c8f0233de90b6b47a23fc422b7780c5436
Author:     Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
AuthorDate: Wed Jan 25 22:07:06 2017 -0200
Commit:     Martin K. Petersen <martin.petersen@oracle.com>
CommitDate: Tue Jan 31 22:25:32 2017 -0500

    scsi: qla2xxx: Avoid that issuing a LIP triggers a kernel crash

[...]

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2017-02-07  0:23 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-23 16:34 [PATCH 0/2] qla2xxx: Two bug fixes Bart Van Assche
2017-01-23 16:34 ` [PATCH 1/2] qla2xxx: Fix a recently introduced memory leak Bart Van Assche
2017-01-23 16:34   ` Bart Van Assche
2017-01-23 16:45   ` Christoph Hellwig
2017-01-23 17:04   ` Madhani, Himanshu
2017-01-24 12:10   ` Johannes Thumshirn
2017-01-24 12:10     ` Johannes Thumshirn
2017-01-25 15:47   ` Bart Van Assche
2017-01-26 14:36     ` hch
2017-01-29  5:17       ` Bart Van Assche
2017-01-29  9:07         ` hch
2017-01-29 17:14           ` Bart Van Assche
2017-01-25 23:28   ` Martin K. Petersen
2017-01-25 23:28     ` Martin K. Petersen
2017-02-03 16:59     ` Bart Van Assche
2017-02-07  0:23       ` Martin K. Petersen
2017-01-23 16:34 ` [PATCH 2/2] qla2xxx: Avoid that issuing a LIP triggers a kernel crash Bart Van Assche
2017-01-23 16:34   ` Bart Van Assche
2017-01-23 17:41   ` Madhani, Himanshu
2017-01-24 12:12   ` Johannes Thumshirn
2017-01-24 12:12     ` Johannes Thumshirn
2017-01-24 14:59   ` Mauricio Faria de Oliveira
2017-01-25 22:05     ` Madhani, Himanshu
2017-01-25 23:29     ` Martin K. Petersen
2017-01-26  0:09       ` Mauricio Faria de Oliveira

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.