* [PATCH] lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt()
@ 2016-06-07 23:13 Mauricio Faria de Oliveira
  2016-07-21 21:39 ` James Smart
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-06-07 23:13 UTC (permalink / raw)
  To: linux-scsi; +Cc: james.smart, dick.kennedy

The lpfc_sli4_scmd_to_wqidx_distr() function expects the scsi_cmnd
'lpfc_cmd->pCmd' to be non-NULL and to point to the midlayer command.

That does not hold in the .eh_(device|target|bus)_reset_handler path:
lpfc_send_taskmgmt() sends commands that do not come from the midlayer,
so it never sets 'lpfc_cmd->pCmd'.

It does hold in the .queuecommand path, because lpfc_queuecommand()
stores the scsi_cmnd from the midlayer in lpfc_cmd->pCmd, and lpfc_cmd
is stored by lpfc_scsi_prep_cmnd() in piocbq->context1, which is then
passed to lpfc_sli4_scmd_to_wqidx_distr() as the lpfc_cmd parameter.

The problem can be hit during SCSI EH, and immediately with sg_reset.
The two test cases below demonstrate the problem and the fix on
next-20160601.

Test-case 1) sg_reset

    # strace sg_reset --device /dev/sdm
    <...>
    open("/dev/sdm", O_RDWR|O_NONBLOCK)     = 3
    ioctl(3, SG_SCSI_RESET, 0x3fffde6d0994 <unfinished ...>
    +++ killed by SIGSEGV +++
    Segmentation fault

    # dmesg
    Unable to handle kernel paging request for data at address 0x00000000
    Faulting instruction address: 0xd00000001c88442c
    Oops: Kernel access of bad area, sig: 11 [#1]
    <...>
    CPU: 104 PID: 16333 Comm: sg_reset Tainted: G        W       4.7.0-rc1-next-20160601-00004-g95b89dc #6
    <...>
    NIP [d00000001c88442c] lpfc_sli4_scmd_to_wqidx_distr+0xc/0xd0 [lpfc]
    LR [d00000001c826fe8] lpfc_sli_calc_ring.part.27+0x98/0xd0 [lpfc]
    Call Trace:
    [c000003c9ec876f0] [c000003c9ec87770] 0xc000003c9ec87770 (unreliable)
    [c000003c9ec87720] [d00000001c82e004] lpfc_sli_issue_iocb+0xd4/0x260 [lpfc]
    [c000003c9ec87780] [d00000001c831a3c] lpfc_sli_issue_iocb_wait+0x15c/0x5b0 [lpfc]
    [c000003c9ec87880] [d00000001c87f27c] lpfc_send_taskmgmt+0x24c/0x650 [lpfc]
    [c000003c9ec87950] [d00000001c87fd7c] lpfc_device_reset_handler+0x10c/0x200 [lpfc]
    [c000003c9ec87a10] [c000000000610694] scsi_try_bus_device_reset+0x44/0xc0
    [c000003c9ec87a40] [c0000000006113e8] scsi_ioctl_reset+0x198/0x2c0
    [c000003c9ec87bf0] [c00000000060fe5c] scsi_ioctl+0x13c/0x4b0
    [c000003c9ec87c80] [c0000000006629b0] sd_ioctl+0xf0/0x120
    [c000003c9ec87cd0] [c00000000046e4f8] blkdev_ioctl+0x248/0xb70
    [c000003c9ec87d30] [c0000000002a1f60] block_ioctl+0x70/0x90
    [c000003c9ec87d50] [c00000000026d334] do_vfs_ioctl+0xc4/0x890
    [c000003c9ec87de0] [c00000000026db60] SyS_ioctl+0x60/0xc0
    [c000003c9ec87e30] [c000000000009120] system_call+0x38/0x108
    Instruction dump:
    <...>

    With fix:

    # strace sg_reset --device /dev/sdm
    <...>
    open("/dev/sdm", O_RDWR|O_NONBLOCK)     = 3
    ioctl(3, SG_SCSI_RESET, 0x3fffe103c554) = 0
    close(3)                                = 0
    exit_group(0)                           = ?
    +++ exited with 0 +++

    # dmesg
    [  424.658649] lpfc 0006:01:00.4: 4:(0):0713 SCSI layer issued Device Reset (1, 0) return x2002

Test-case 2) SCSI EH

    Using this debug patch to lpfc_scsi_cmd_iocb_cmpl() to wire up a SCSI EH trigger:
    -       cmd->scsi_done(cmd);
    +       if ((phba->pport ? phba->pport->cfg_log_verbose : phba->cfg_log_verbose) == 0x32100000)
    +               printk(KERN_ALERT "lpfc: skip scsi_done()\n");
    +       else
    +               cmd->scsi_done(cmd);

    # echo 0x32100000 > /sys/class/scsi_host/host11/lpfc_log_verbose

    # dd if=/dev/sdm of=/dev/null iflag=direct &
    <...>

    After a while:

    # dmesg
    lpfc 0006:01:00.4: 4:(0):3053 lpfc_log_verbose changed from 0 (x0) to 839909376 (x32100000)
    lpfc: skip scsi_done()
    <...>
    Unable to handle kernel paging request for data at address 0x00000000
    Faulting instruction address: 0xd0000000199e448c
    Oops: Kernel access of bad area, sig: 11 [#1]
    <...>
    CPU: 96 PID: 28556 Comm: scsi_eh_11 Tainted: G        W       4.7.0-rc1-next-20160601-00004-g95b89dc #6
    <...>
    NIP [d0000000199e448c] lpfc_sli4_scmd_to_wqidx_distr+0xc/0xd0 [lpfc]
    LR [d000000019986fe8] lpfc_sli_calc_ring.part.27+0x98/0xd0 [lpfc]
    Call Trace:
    [c000000ff0d0b890] [c000000ff0d0b900] 0xc000000ff0d0b900 (unreliable)
    [c000000ff0d0b8c0] [d00000001998e004] lpfc_sli_issue_iocb+0xd4/0x260 [lpfc]
    [c000000ff0d0b920] [d000000019991a3c] lpfc_sli_issue_iocb_wait+0x15c/0x5b0 [lpfc]
    [c000000ff0d0ba20] [d0000000199df27c] lpfc_send_taskmgmt+0x24c/0x650 [lpfc]
    [c000000ff0d0baf0] [d0000000199dfd7c] lpfc_device_reset_handler+0x10c/0x200 [lpfc]
    [c000000ff0d0bbb0] [c000000000610694] scsi_try_bus_device_reset+0x44/0xc0
    [c000000ff0d0bbe0] [c0000000006126cc] scsi_eh_ready_devs+0x49c/0x9c0
    [c000000ff0d0bcb0] [c000000000614160] scsi_error_handler+0x580/0x680
    [c000000ff0d0bd80] [c0000000000ae848] kthread+0x108/0x130
    [c000000ff0d0be30] [c0000000000094a8] ret_from_kernel_thread+0x5c/0xb4
    Instruction dump:
    <...>

    With fix:

    # dmesg
    lpfc 0006:01:00.4: 4:(0):3053 lpfc_log_verbose changed from 0 (x0) to 839909376 (x32100000)
    lpfc: skip scsi_done()
    <...>
    lpfc 0006:01:00.4: 4:(0):0713 SCSI layer issued Device Reset (0, 0) return x2002
    <...>
    lpfc 0006:01:00.4: 4:(0):0723 SCSI layer issued Target Reset (1, 0) return x2002
    <...>
    lpfc 0006:01:00.4: 4:(0):0714 SCSI layer issued Bus Reset Data: x2002
    <...>
    lpfc 0006:01:00.4: 4:(0):3172 SCSI layer issued Host Reset Data:
    <...>
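
The trigger used in test-case 2 can be sketched in user space as follows.
This is only an illustration under stated assumptions: every mock_* name is
hypothetical and stands in for the driver code, and only the magic value
0x32100000 comes from the debug patch above. The idea is that a completion
which is never delivered leaves the command outstanding until it times out,
which is what eventually brings SCSI EH into play.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Magic lpfc_log_verbose value taken from the debug patch in this mail. */
#define MOCK_EH_TRIGGER 0x32100000u

/* Hypothetical stand-in for a command; it only counts completions. */
struct mock_cmd {
	int done_calls;
};

/* Hypothetical stand-in for cmd->scsi_done(cmd). */
static void mock_scsi_done(struct mock_cmd *cmd)
{
	cmd->done_calls++;
}

/*
 * Mirrors the shape of the debug hunk: when the trigger value is set,
 * the completion is swallowed instead of being delivered.
 * Returns 1 if the completion was delivered, 0 if it was skipped.
 */
static int mock_complete(struct mock_cmd *cmd, uint32_t log_verbose)
{
	assert(cmd != NULL);

	if (log_verbose == MOCK_EH_TRIGGER)
		return 0;	/* skipped: the command stays outstanding */

	mock_scsi_done(cmd);
	return 1;
}

/* Helper: complete one fresh command and report how often done() ran. */
static int mock_done_calls_after(uint32_t log_verbose)
{
	struct mock_cmd cmd = { 0 };

	(void)mock_complete(&cmd, log_verbose);
	return cmd.done_calls;
}
```

The hexadecimal trigger value is the same number that dmesg prints in
decimal (839909376) when lpfc_log_verbose is changed.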

Fixes: 8b0dff14164d ("lpfc: Add support for using block multi-queue")
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
---
 drivers/scsi/lpfc/lpfc_scsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index 3bd0be6..c7e5695 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -3874,7 +3874,7 @@ int lpfc_sli4_scmd_to_wqidx_distr(struct lpfc_hba *phba,
 	uint32_t tag;
 	uint16_t hwq;
 
-	if (shost_use_blk_mq(cmnd->device->host)) {
+	if (cmnd && shost_use_blk_mq(cmnd->device->host)) {
 		tag = blk_mq_unique_tag(cmnd->request);
 		hwq = blk_mq_unique_tag_to_hwq(tag);
 
-- 
1.8.3.1
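
For readers without the driver source at hand, the effect of the one-line
change can be sketched in user space. This is a minimal model, not the
driver code: all mock_* types and functions are hypothetical stand-ins, the
hwq field replaces the blk_mq_unique_tag()/blk_mq_unique_tag_to_hwq() lookup,
and the fallback queue index is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct mock_host      { int use_blk_mq; };
struct mock_device    { struct mock_host *host; };
struct mock_scsi_cmnd { struct mock_device *device; uint16_t hwq; };
struct mock_lpfc_cmd  { struct mock_scsi_cmnd *pCmd; };

/* Assumed fallback index, used when no midlayer command is attached. */
#define MOCK_FALLBACK_WQ 0

/*
 * Models the guarded check from the patch: cmnd is dereferenced only
 * when it is non-NULL, so task management commands (pCmd == NULL)
 * take the fallback path instead of faulting.
 */
static int mock_scmd_to_wqidx(const struct mock_lpfc_cmd *lpfc_cmd)
{
	const struct mock_scsi_cmnd *cmnd;

	assert(lpfc_cmd != NULL);
	cmnd = lpfc_cmd->pCmd;

	if (cmnd && cmnd->device->host->use_blk_mq)
		return cmnd->hwq;

	return MOCK_FALLBACK_WQ;
}

/* .queuecommand-like path: pCmd points to a midlayer command. */
static int mock_wqidx_for_midlayer_cmd(uint16_t hwq)
{
	struct mock_host host = { 1 };
	struct mock_device dev = { &host };
	struct mock_scsi_cmnd cmnd = { &dev, hwq };
	struct mock_lpfc_cmd lpfc_cmd = { &cmnd };

	return mock_scmd_to_wqidx(&lpfc_cmd);
}

/* Reset-handler-like path: pCmd was never set. */
static int mock_wqidx_for_taskmgmt(void)
{
	struct mock_lpfc_cmd lpfc_cmd = { NULL };

	return mock_scmd_to_wqidx(&lpfc_cmd);
}
```

Without the "cmnd &&" test, the second helper would dereference a NULL
pointer, which corresponds to the oops shown in both test cases.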



* Re: [PATCH] lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt()
  2016-06-07 23:13 [PATCH] lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt() Mauricio Faria de Oliveira
@ 2016-07-21 21:39 ` James Smart
  2016-07-22  8:16 ` Johannes Thumshirn
  2016-07-22 20:18 ` Martin K. Petersen
  2 siblings, 0 replies; 6+ messages in thread
From: James Smart @ 2016-07-21 21:39 UTC (permalink / raw)
  To: Mauricio Faria de Oliveira, linux-scsi; +Cc: dick.kennedy, James Smart

Looks good.

Signed-off-by: James Smart <james.smart@broadcom.com>

-- james







* Re: [PATCH] lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt()
  2016-06-07 23:13 [PATCH] lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt() Mauricio Faria de Oliveira
  2016-07-21 21:39 ` James Smart
@ 2016-07-22  8:16 ` Johannes Thumshirn
  2016-07-22 20:18 ` Martin K. Petersen
  2 siblings, 0 replies; 6+ messages in thread
From: Johannes Thumshirn @ 2016-07-22  8:16 UTC (permalink / raw)
  To: Mauricio Faria de Oliveira; +Cc: linux-scsi, james.smart, dick.kennedy


Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


* Re: [PATCH] lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt()
  2016-06-07 23:13 [PATCH] lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt() Mauricio Faria de Oliveira
  2016-07-21 21:39 ` James Smart
  2016-07-22  8:16 ` Johannes Thumshirn
@ 2016-07-22 20:18 ` Martin K. Petersen
  2016-07-25 15:50   ` Mauricio Faria de Oliveira
  2 siblings, 1 reply; 6+ messages in thread
From: Martin K. Petersen @ 2016-07-22 20:18 UTC (permalink / raw)
  To: Mauricio Faria de Oliveira; +Cc: linux-scsi, james.smart, dick.kennedy

>>>>> "Mauricio" == Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> writes:

Mauricio> The lpfc_sli4_scmd_to_wqidx_distr() function expects the
Mauricio> scsi_cmnd 'lpfc_cmd->pCmd' not to be null, and point to the
Mauricio> midlayer command.

Applied to 4.8/scsi-queue.

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: [PATCH] lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt()
  2016-07-22 20:18 ` Martin K. Petersen
@ 2016-07-25 15:50   ` Mauricio Faria de Oliveira
  2016-07-27  4:33     ` Martin K. Petersen
  0 siblings, 1 reply; 6+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-07-25 15:50 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-scsi, james.smart, dick.kennedy

On 07/22/2016 05:18 PM, Martin K. Petersen wrote:
> Applied to 4.8/scsi-queue.

Can this be routed into stable?

The 'Fixes:' commit was introduced in v4.2, and since then there have been
the longterm v4.4 and the stable v4.5 and v4.6 series.

Since the patch fixes an oops (which may panic, depending on configuration)
that happens in normal scenarios (e.g. SCSI EH), it seems appropriate.

Thanks,

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center



* Re: [PATCH] lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt()
  2016-07-25 15:50   ` Mauricio Faria de Oliveira
@ 2016-07-27  4:33     ` Martin K. Petersen
  0 siblings, 0 replies; 6+ messages in thread
From: Martin K. Petersen @ 2016-07-27  4:33 UTC (permalink / raw)
  To: Mauricio Faria de Oliveira
  Cc: Martin K. Petersen, linux-scsi, james.smart, dick.kennedy

>>>>> "Mauricio" == Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> writes:

Mauricio> Can this be routed into stable?

Added a stable tag.

-- 
Martin K. Petersen	Oracle Linux Engineering

