From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: James Smart <jsmart2021@gmail.com>,
	Dick Kennedy <dick.kennedy@broadcom.com>,
	Ming Lei <ming.lei@redhat.com>,
	"Ewan D . Milne" <emilne@redhat.com>,
	"Martin K . Petersen" <martin.petersen@oracle.com>,
	Sasha Levin <sashal@kernel.org>,
	linux-scsi@vger.kernel.org
Subject: [PATCH AUTOSEL 5.2 52/76] scsi: lpfc: Mitigate high memory pre-allocation by SCSI-MQ
Date: Thu, 29 Aug 2019 14:12:47 -0400
Message-ID: <20190829181311.7562-52-sashal@kernel.org>
In-Reply-To: <20190829181311.7562-1-sashal@kernel.org>

From: James Smart <jsmart2021@gmail.com>

[ Upstream commit 77ffd3465ba837e9dc714e17b014e77b2eae765a ]

When SCSI-MQ is enabled, the SCSI-MQ layers pre-allocate MQ resources
based on shost values set by the driver. Newer versions of the driver
attempt to set nr_hw_queues to the CPU count, so the multipliers become
excessive, with SCSI-MQ pre-allocation for a single shost reaching into
the multiple-GByte range. NPIV, which creates additional shosts, only
multiplies this overhead. On lower-memory systems, this can exhaust
system memory very quickly, resulting in a system crash or failures in
the driver or elsewhere due to low-memory conditions.

After testing several scenarios, the situation can be mitigated by limiting
the value set in shost->nr_hw_queues to 4. Although the shost values were
changed, the driver still had per-cpu hardware queues of its own that
allowed parallelization per-cpu.  Testing revealed that even with the
smallish number for nr_hw_queues for SCSI-MQ, performance levels remained
near maximum with the within-driver affinitization.

A module parameter was created to allow the value set for the nr_hw_queues
to be tunable.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
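The queue-count selection described in the commit message can be
sketched outside the kernel as follows. This is only an illustrative
userspace stand-in, not driver code: the function names
clamp_mq_threshold() and nr_hw_queues() are hypothetical, the
possible_nodes argument stands in for num_possible_nodes(), and
hdw_queue is assumed to have already been resolved to a non-zero
hardware queue count before this point.

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the threshold fix-up in the patch:
 * a threshold of 0, or one larger than the driver's hardware queue
 * count, falls back to the hardware queue count.
 */
static unsigned int clamp_mq_threshold(unsigned int threshold,
				       unsigned int hdw_queue)
{
	if (!threshold || threshold > hdw_queue)
		return hdw_queue;
	return threshold;
}

/*
 * Hypothetical stand-in for the value assigned to shost->nr_hw_queues:
 * the smaller of twice the number of possible NUMA nodes and the
 * clamped threshold, compared as int like min_t(int, ...) does.
 */
static int nr_hw_queues(unsigned int threshold, unsigned int hdw_queue,
			int possible_nodes)
{
	int by_nodes, by_threshold;

	/* Assumption of this sketch: at least one node, one hw queue. */
	assert(possible_nodes >= 1);
	assert(hdw_queue >= 1);

	by_nodes = 2 * possible_nodes;
	by_threshold = (int)clamp_mq_threshold(threshold, hdw_queue);

	return by_nodes < by_threshold ? by_nodes : by_threshold;
}
```

Under these assumptions, a two-node system with 64 hardware queues and
the default threshold of 8 yields nr_hw_queues(8, 64, 2) == 4, since
the node-based bound is the smaller one. With the threshold set to 0
on a four-node system with 16 hardware queues, the threshold falls
back to 16 and the node-based bound of 8 applies.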
 drivers/scsi/lpfc/lpfc.h      |  1 +
 drivers/scsi/lpfc/lpfc_attr.c | 15 +++++++++++++++
 drivers/scsi/lpfc/lpfc_init.c | 10 ++++++----
 drivers/scsi/lpfc/lpfc_sli4.h |  5 +++++
 4 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index aafcffaa25f71..4604e1bc334c0 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -822,6 +822,7 @@ struct lpfc_hba {
 	uint32_t cfg_cq_poll_threshold;
 	uint32_t cfg_cq_max_proc_limit;
 	uint32_t cfg_fcp_cpu_map;
+	uint32_t cfg_fcp_mq_threshold;
 	uint32_t cfg_hdw_queue;
 	uint32_t cfg_irq_chann;
 	uint32_t cfg_suppress_rsp;
diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index d4c65e2109e2f..353da12d797ba 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -5640,6 +5640,19 @@ LPFC_ATTR_RW(nvme_oas, 0, 0, 1,
 LPFC_ATTR_RW(nvme_embed_cmd, 1, 0, 2,
 	     "Embed NVME Command in WQE");
 
+/*
+ * lpfc_fcp_mq_threshold: Set the maximum number of Hardware Queues
+ * the driver will advertise it supports to the SCSI layer.
+ *
+ *      0    = Set nr_hw_queues by the number of CPUs or HW queues.
+ *      1,128 = Manually specify the maximum nr_hw_queue value to be set,
+ *
+ * Value range is [0,128]. Default value is 8.
+ */
+LPFC_ATTR_R(fcp_mq_threshold, LPFC_FCP_MQ_THRESHOLD_DEF,
+	    LPFC_FCP_MQ_THRESHOLD_MIN, LPFC_FCP_MQ_THRESHOLD_MAX,
+	    "Set the number of SCSI Queues advertised");
+
 /*
  * lpfc_hdw_queue: Set the number of Hardware Queues the driver
  * will advertise it supports to the NVME and  SCSI layers. This also
@@ -5961,6 +5974,7 @@ struct device_attribute *lpfc_hba_attrs[] = {
 	&dev_attr_lpfc_cq_poll_threshold,
 	&dev_attr_lpfc_cq_max_proc_limit,
 	&dev_attr_lpfc_fcp_cpu_map,
+	&dev_attr_lpfc_fcp_mq_threshold,
 	&dev_attr_lpfc_hdw_queue,
 	&dev_attr_lpfc_irq_chann,
 	&dev_attr_lpfc_suppress_rsp,
@@ -7042,6 +7056,7 @@ lpfc_get_cfgparam(struct lpfc_hba *phba)
 	/* Initialize first burst. Target vs Initiator are different. */
 	lpfc_nvme_enable_fb_init(phba, lpfc_nvme_enable_fb);
 	lpfc_nvmet_fb_size_init(phba, lpfc_nvmet_fb_size);
+	lpfc_fcp_mq_threshold_init(phba, lpfc_fcp_mq_threshold);
 	lpfc_hdw_queue_init(phba, lpfc_hdw_queue);
 	lpfc_irq_chann_init(phba, lpfc_irq_chann);
 	lpfc_enable_bbcr_init(phba, lpfc_enable_bbcr);
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index eaaef682de251..2fd8f15f99975 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -4308,10 +4308,12 @@ lpfc_create_port(struct lpfc_hba *phba, int instance, struct device *dev)
 	shost->max_cmd_len = 16;
 
 	if (phba->sli_rev == LPFC_SLI_REV4) {
-		if (phba->cfg_fcp_io_sched == LPFC_FCP_SCHED_BY_HDWQ)
-			shost->nr_hw_queues = phba->cfg_hdw_queue;
-		else
-			shost->nr_hw_queues = phba->sli4_hba.num_present_cpu;
+		if (!phba->cfg_fcp_mq_threshold ||
+		    phba->cfg_fcp_mq_threshold > phba->cfg_hdw_queue)
+			phba->cfg_fcp_mq_threshold = phba->cfg_hdw_queue;
+
+		shost->nr_hw_queues = min_t(int, 2 * num_possible_nodes(),
+					    phba->cfg_fcp_mq_threshold);
 
 		shost->dma_boundary =
 			phba->sli4_hba.pc_sli4_params.sge_supp_len-1;
diff --git a/drivers/scsi/lpfc/lpfc_sli4.h b/drivers/scsi/lpfc/lpfc_sli4.h
index 8e4fd1a98023c..986594ec40e2a 100644
--- a/drivers/scsi/lpfc/lpfc_sli4.h
+++ b/drivers/scsi/lpfc/lpfc_sli4.h
@@ -44,6 +44,11 @@
 #define LPFC_HBA_HDWQ_MAX	128
 #define LPFC_HBA_HDWQ_DEF	0
 
+/* FCP MQ queue count limiting */
+#define LPFC_FCP_MQ_THRESHOLD_MIN	0
+#define LPFC_FCP_MQ_THRESHOLD_MAX	128
+#define LPFC_FCP_MQ_THRESHOLD_DEF	8
+
 /* Common buffer size to accomidate SCSI and NVME IO buffers */
 #define LPFC_COMMON_IO_BUF_SZ	768
 
-- 
2.20.1

