From: John Garry <john.garry@huawei.com>
To: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Qian Cai <cai@redhat.com>,
	Kashyap Desai <kashyap.desai@broadcom.com>,
	Jens Axboe <axboe@kernel.dk>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	"don.brace@microsemi.com" <don.brace@microsemi.com>,
	Ming Lei <ming.lei@redhat.com>,
	Bart Van Assche <bvanassche@acm.org>,
	"dgilbert@interlog.com" <dgilbert@interlog.com>,
	"paolo.valente@linaro.org" <paolo.valente@linaro.org>,
	Hannes Reinecke <hare@suse.de>, Christoph Hellwig <hch@lst.de>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"Linux SCSI List" <linux-scsi@vger.kernel.org>,
	"esc.storagedev@microsemi.com" <esc.storagedev@microsemi.com>,
	"PDL,MEGARAIDLINUX" <megaraidlinux.pdl@broadcom.com>,
	"chenxiang (M)" <chenxiang66@hisilicon.com>,
	luojiaxing <luojiaxing@huawei.com>,
	"Hannes Reinecke" <hare@suse.com>
Subject: Re: [PATCH v8 17/18] scsi: megaraid_sas: Added support for shared host tagset for cpuhotplug
Date: Wed, 11 Nov 2020 11:51:25 +0000
Message-ID: <e4d2031e-63f7-4c9a-daf5-4cb2b1ff3052@huawei.com>
In-Reply-To: <CAL2rwxpQt-w2Re8ttu0=6Yzb7ibX3_FB6j-kd_cbtrWxzc7chw@mail.gmail.com>

> 
> In Qian's kernel .config, async scsi scan is disabled so in failure
> case SCSI scan type is synchronous.
> Below is the stack trace when scsi_scan_host() hangs:
> 
> [<0>] __wait_rcu_gp+0x134/0x170
> [<0>] synchronize_rcu.part.80+0x53/0x60
> [<0>] blk_free_flush_queue+0x12/0x30
> [<0>] blk_mq_hw_sysfs_release+0x21/0x70

This is done once per blk_mq_hw_ctx.

> [<0>] kobject_release+0x46/0x150
> [<0>] blk_mq_release+0xb4/0xf0
> [<0>] blk_release_queue+0xc4/0x130
> [<0>] kobject_release+0x46/0x150
> [<0>] scsi_device_dev_release_usercontext+0x194/0x3f0
> [<0>] execute_in_process_context+0x22/0xa0
> [<0>] device_release+0x2e/0x80
> [<0>] kobject_release+0x46/0x150
> [<0>] scsi_alloc_sdev+0x2e7/0x310
> [<0>] scsi_probe_and_add_lun+0x410/0xbd0
> [<0>] __scsi_scan_target+0xf2/0x530
> [<0>] scsi_scan_channel.part.7+0x51/0x70
> [<0>] scsi_scan_host_selected+0xd4/0x140
> [<0>] scsi_scan_host+0x198/0x1c0
> 
> This issue hits when lock related debugging is enabled in kernel config.
> kernel .config parameters(may be subset of this list) are required to
> hit the issue:
> 
> CONFIG_PREEMPT_COUNT=y *
> CONFIG_UNINLINE_SPIN_UNLOCK=y *
> CONFIG_LOCK_STAT=y
> CONFIG_DEBUG_RT_MUTEXES=y *
> CONFIG_DEBUG_SPINLOCK=y *
> CONFIG_DEBUG_MUTEXES=y *
> CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y *
> CONFIG_DEBUG_RWSEMS=y *
> CONFIG_DEBUG_LOCK_ALLOC=y *
> CONFIG_LOCKDEP=y *
> CONFIG_DEBUG_LOCKDEP=y
> CONFIG_TRACE_IRQFLAGS=y *
> CONFIG_TRACE_IRQFLAGS_NMI=y
> CONFIG_DEBUG_KOBJECT=y 
> CONFIG_PROVE_RCU=y *
> CONFIG_PREEMPTIRQ_TRACEPOINTS=y *

(* marks the options that I enabled)

> 
> When scsi_scan_host() hangs, there are no outstanding IOs with
> megaraid_sas driver-firmware stack as SCSI "host_busy" counter and
> megaraid_sas driver's internal counter are "0".
> Key takeaways:
> 1. Issue is observed when lock related debugging is enabled so issue
> is seen in debug environment.
> 2. Issue seems to be related to generic shared "host_tagset" code
> whenever some kind of kernel debugging is enabled. We do not see an
> immediate reason to hide this issue through disabling the
> "host_tagset" feature.
> 
> John,
> Issue may hit on ARM platform too using Qian's .config file with other
> adapters (e.g. hisi_sas) as well. So I feel disabling “host_tagset” in
> megaraid_sas driver will not help.  It requires debugging from the
> “Entire Shared host tag feature” perspective as scsi_scan_host()
> waittime aggravates when "host_tagset" is enabled. Also, I am doing
> parallel debugging and if I find anything useful, I will share.

So isn't this really just down to how many HW queues we expose, which
scales up the time? For megaraid_sas, it goes from 1 to 128 on my arm64
platform when host_tagset_enable=1.
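
To put rough numbers on that: if each blk_mq_hw_ctx release really does wait
for an RCU grace period, as the trace above suggests, then every request
queue torn down during the scan costs about nr_hw_queues grace periods. A
back-of-envelope model in userspace C follows; the function name and its
parameters are made up for illustration and are not kernel code:

```c
/*
 * Rough model only. Assumes one serial grace-period wait per hctx
 * release and nothing else contributing to the scan time.
 *
 * nr_sdevs_released: scsi devices allocated and then freed during scan
 * nr_hw_queues:      HW queues exposed by the host
 * gp_latency_ms:     assumed latency of a single RCU grace period
 */
static unsigned long scan_wait_ms(unsigned long nr_sdevs_released,
				  unsigned long nr_hw_queues,
				  unsigned long gp_latency_ms)
{
	return nr_sdevs_released * nr_hw_queues * gp_latency_ms;
}
```

With 1000 released sdevs and a 20 ms grace period (both invented values, not
measurements), this gives 20 seconds for 1 HW queue and roughly 43 minutes
for 128.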

As a hack, I tried this (while keeping host_tagset_enable=1):

@@ -6162,11 +6168,15 @@ static int megasas_init_fw(struct megasas_instance *instance)
                else
                        instance->low_latency_index_start = 1;

-               num_msix_req = num_online_cpus() + instance->low_latency_index_start;
+               num_msix_req = 6 + instance->low_latency_index_start;

(6 is an arbitrary small number)
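
For illustration, the effect on the requested vector count can be sketched in
userspace C. Note this variant clamps to a cap rather than hardcoding 6 as the
hack does; the function name and the max_reply_queues parameter are
hypothetical, and online_cpus stands in for num_online_cpus():

```c
/*
 * Userspace sketch, not driver code.
 *
 * online_cpus:             stand-in for num_online_cpus()
 * low_latency_index_start: as in the driver, extra vectors reserved
 * max_reply_queues:        hypothetical cap; 0 means no cap
 */
static unsigned int msix_vectors_requested(unsigned int online_cpus,
					   unsigned int low_latency_index_start,
					   unsigned int max_reply_queues)
{
	unsigned int queues = online_cpus;

	/* Limit the per-CPU queue count when a cap is given. */
	if (max_reply_queues && queues > max_reply_queues)
		queues = max_reply_queues;

	return queues + low_latency_index_start;
}
```

Assuming 128 online CPUs and low_latency_index_start = 1, this requests 129
vectors without a cap and 7 with a cap of 6.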

With that, boot time is nearly the same as with host_tagset_enable=0.

For hisi_sas, the maximum number of HW queues is only ever 16. In addition,
we don't scan each channel/id/lun for hisi_sas, as it has a scan handler.

> 
> Qian,
> I need full dmesg logs from your setup with
> megaraid_sas.host_tagset_enable=1 and
> megaraid_sas.host_tagset_enable=0. Please wait for a long time. I just
> want to make sure that whatever you observe is the same as mine.
> 

Thanks,
John

