linux-scsi.vger.kernel.org archive mirror
From: Bart Van Assche <bvanassche@acm.org>
To: Himanshu Madhani <hmadhani@marvell.com>
Cc: "<James.Bottomley@hansenpartnership.com>" 
	<James.Bottomley@HansenPartnership.com>,
	"Martin K . Petersen" <martin.petersen@oracle.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH 4/8] qla2xxx: Fix driver unload hang
Date: Thu, 7 Nov 2019 09:58:58 -0800
Message-ID: <10b38f34-128a-fd71-1542-9025dc107f62@acm.org>
In-Reply-To: <83CC0DDF-4907-41A2-91EC-1569A07A6BA9@marvell.com>

On 11/7/19 9:55 AM, Himanshu Madhani wrote:
> 
> 
>> On Nov 7, 2019, at 10:54 AM, Bart Van Assche <bvanassche@acm.org> wrote:
>>
>> On 11/5/19 7:06 AM, Himanshu Madhani wrote:
>>> From: Quinn Tran <qutran@marvell.com>
>>> This patch fixes driver unload hang by removing msleep()
>>> Fixes: d74595278f4ab ("scsi: qla2xxx: Add multiple queue pair functionality.")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: Quinn Tran <qutran@marvell.com>
>>> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
>>> ---
>>>  drivers/scsi/qla2xxx/qla_init.c | 2 --
>>>  1 file changed, 2 deletions(-)
>>> diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
>>> index bddb26baedd2..ff4528702b4e 100644
>>> --- a/drivers/scsi/qla2xxx/qla_init.c
>>> +++ b/drivers/scsi/qla2xxx/qla_init.c
>>> @@ -9009,8 +9009,6 @@ int qla2xxx_delete_qpair(struct scsi_qla_host *vha, struct qla_qpair *qpair)
>>>  	struct qla_hw_data *ha = qpair->hw;
>>>  
>>>  	qpair->delete_in_progress = 1;
>>> -	while (atomic_read(&qpair->ref_count))
>>> -		msleep(500);
>>>  
>>>  	ret = qla25xx_delete_req_que(vha, qpair->req);
>>>  	if (ret != QLA_SUCCESS)
>>
>> I think that an explanation is needed why that loop had been
>> introduced and also why it is safe not to wait until qpair->ref_count
>> drops to zero in qla2xxx_delete_qpair().
>>
> 
> Commit d74595278f4ab had a design drawback in the MQ implementation in
> qla2xxx. We have been making this more stable now that MQ is on by
> default for 5.x kernels. What we discovered is that after a heavy I/O
> workload in a cluster environment, driver unload hung with the
> following stack trace:
> 
> # ps -fax | grep rmmod
> 6029 pts/0 D+ 0:00 | \_ rmmod qla2xxx
> 
> [<0>] msleep+0x29/0x30
> [<0>] qla2xxx_delete_qpair+0x2c/0x160 [qla2xxx]
> [<0>] qla25xx_delete_queues+0x14b/0x1d0 [qla2xxx]
> [<0>] qla2x00_free_device+0x31/0xe0 [qla2xxx]
> [<0>] qla2x00_remove_one+0x239/0x370 [qla2xxx]
> [<0>] pci_device_remove+0x3b/0xc0
> [<0>] device_release_driver_internal+0x18c/0x250
> [<0>] driver_detach+0x39/0x6d
> [<0>] bus_remove_driver+0x77/0xc9
> [<0>] pci_unregister_driver+0x2d/0xb0
> [<0>] qla2x00_module_exit+0x2d/0x90 [qla2xxx]
> [<0>] __x64_sys_delete_module+0x139/0x270
> [<0>] do_syscall_64+0x5b/0x1b0
> [<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca
> [<0>] 0xffffffffffffffff
> 
> Removing this msleep() helps resolve this hang.

Hi Himanshu,

Does your answer mean that this hang has not yet been root-caused fully
and hence that it is possible this patch is only a workaround but not a
fix of the root cause?

Thanks,

Bart.
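
As an illustration of the question above: the loop removed by the patch polls qpair->ref_count with no upper bound, so a reference that is never dropped would leave rmmod sleeping in msleep() indefinitely. The following is a minimal userspace sketch of that pattern with a bound added, not driver code. The names demo_qpair, demo_msleep and demo_wait_for_refs are hypothetical, and the poll limit is an assumption made purely for illustration. A timeout here does not fix anything; it only shows whether a reference was leaked, which is the root cause being asked about.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical userspace stand-in for the driver's struct qla_qpair. */
struct demo_qpair {
	atomic_int ref_count;
	bool delete_in_progress;
};

/* Sleep for ms milliseconds (userspace stand-in for the kernel's msleep()). */
static void demo_msleep(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};

	nanosleep(&ts, NULL);
}

/*
 * Bounded variant of "while (atomic_read(&qpair->ref_count)) msleep(500);".
 * Returns true once ref_count has dropped to zero, or false after
 * max_polls polls, which would point at a reference that is never put.
 */
static bool demo_wait_for_refs(struct demo_qpair *qp, int max_polls,
			       long poll_ms)
{
	assert(qp != NULL);

	qp->delete_in_progress = true;
	for (int i = 0; i < max_polls; i++) {
		if (atomic_load(&qp->ref_count) == 0)
			return true;
		demo_msleep(poll_ms);
	}
	return atomic_load(&qp->ref_count) == 0;
}
```

With ref_count stuck at a nonzero value, the original unbounded loop never returns, while demo_wait_for_refs() returns false after max_polls * poll_ms milliseconds. Whether qla2xxx actually leaks a reference in this path is not established in this thread.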

