* [PATCH 0/2] verify module parameter write_queues and poll_queues
@ 2020-04-09 15:57 Weiping Zhang
  2020-04-09 15:57 ` [PATCH 1/2] nvme: make sure write/poll_queues less or equal then cpu count Weiping Zhang
  2020-04-09 15:57 ` [PATCH 2/2] nvme: no need check write/poll_queues in nvme_init Weiping Zhang
  0 siblings, 2 replies; 5+ messages in thread
From: Weiping Zhang @ 2020-04-09 15:57 UTC (permalink / raw)
  To: hch, axboe, kbusch, sagi; +Cc: linux-nvme

Hi,

This patchset adds verification for the module parameters write_queues and
poll_queues to avoid triggering a warning when they are misconfigured.

Weiping Zhang (2):
  nvme: make sure write/poll_queues less or equal then cpu count
  nvme: no need check write/poll_queues in nvme_init

 drivers/nvme/host/pci.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

-- 
2.18.1


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme


* [PATCH 1/2] nvme: make sure write/poll_queues less or equal then cpu count
  2020-04-09 15:57 [PATCH 0/2] verify module parameter write_queues and poll_queues Weiping Zhang
@ 2020-04-09 15:57 ` Weiping Zhang
  2020-04-09 17:22   ` Keith Busch
  2020-04-09 15:57 ` [PATCH 2/2] nvme: no need check write/poll_queues in nvme_init Weiping Zhang
  1 sibling, 1 reply; 5+ messages in thread
From: Weiping Zhang @ 2020-04-09 15:57 UTC (permalink / raw)
  To: hch, axboe, kbusch, sagi; +Cc: linux-nvme

Check the module parameters write_queues and poll_queues before use; if
the user sets them larger than the system CPU count, a warning is
triggered.

Reproduce:

echo $((`nproc`+1)) > /sys/module/nvme/parameters/write_queues
echo 1 > /sys/block/nvme0n1/device/reset_controller

[  657.069000] ------------[ cut here ]------------
[  657.069022] WARNING: CPU: 10 PID: 1163 at kernel/irq/affinity.c:390 irq_create_affinity_masks+0x47c/0x4a0
[  657.069023] Modules linked in: overlay xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nft_counter nf_nat_tftp nf_conntrack_tftp nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack tun bridge nf_defrag_ipv6 nf_defrag_ipv4 stp llc ip6_tables ip_tables nft_compat rfkill ip_set nf_tables nfnetlink sunrpc intel_rapl_msr intel_rapl_common skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm ipmi_ssif iTCO_wdt iTCO_vendor_support irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore raid0 intel_rapl_perf mei_me ioatdma joydev pcspkr sg mei i2c_i801 lpc_ich dca ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad xfs libcrc32c sd_mod ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops nvme ahci drm i40e libahci nvme_core crc32c_intel libata t10_pi wmi dm_mirror
[  657.069056]  dm_region_hash dm_log dm_mod
[  657.069059] CPU: 10 PID: 1163 Comm: kworker/u193:9 Kdump: loaded Tainted: G        W         5.6.0+ #8
[  657.069060] Hardware name: Inspur SA5212M5/YZMB-00882-104, BIOS 4.0.9 08/27/2019
[  657.069064] Workqueue: nvme-reset-wq nvme_reset_work [nvme]
[  657.069066] RIP: 0010:irq_create_affinity_masks+0x47c/0x4a0
[  657.069067] Code: fe ff ff 48 c7 c0 b0 89 14 95 48 89 46 20 e9 e9 fb ff ff 31 c0 e9 90 fc ff ff 0f 0b 48 c7 44 24 08 00 00 00 00 e9 e9 fc ff ff <0f> 0b e9 87 fe ff ff 48 8b 7c 24 28 e8 33 a0 80 00 e9 b6 fc ff ff
[  657.069068] RSP: 0018:ffffb505ce1ffc78 EFLAGS: 00010202
[  657.069069] RAX: 0000000000000060 RBX: ffff9b97921fe5c0 RCX: 0000000000000000
[  657.069069] RDX: ffff9b67bad80000 RSI: 00000000ffffffa0 RDI: 0000000000000000
[  657.069070] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9b97921fe718
[  657.069070] R10: ffff9b97921fe710 R11: 0000000000000001 R12: 0000000000000064
[  657.069070] R13: 0000000000000060 R14: 0000000000000000 R15: 0000000000000001
[  657.069071] FS:  0000000000000000(0000) GS:ffff9b67c0880000(0000) knlGS:0000000000000000
[  657.069072] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  657.069072] CR2: 0000559eac6fc238 CR3: 000000057860a002 CR4: 00000000007606e0
[  657.069073] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  657.069073] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  657.069073] PKRU: 55555554
[  657.069074] Call Trace:
[  657.069080]  __pci_enable_msix_range+0x233/0x5a0
[  657.069085]  ? kernfs_put+0xec/0x190
[  657.069086]  pci_alloc_irq_vectors_affinity+0xbb/0x130
[  657.069089]  nvme_reset_work+0x6e6/0xeab [nvme]
[  657.069093]  ? __switch_to_asm+0x34/0x70
[  657.069094]  ? __switch_to_asm+0x40/0x70
[  657.069095]  ? nvme_irq_check+0x30/0x30 [nvme]
[  657.069098]  process_one_work+0x1a7/0x370
[  657.069101]  worker_thread+0x1c9/0x380
[  657.069102]  ? max_active_store+0x80/0x80
[  657.069103]  kthread+0x112/0x130
[  657.069104]  ? __kthread_parkme+0x70/0x70
[  657.069105]  ret_from_fork+0x35/0x40
[  657.069106] ---[ end trace f4f06b7d24513d06 ]---
[  657.077110] nvme nvme0: 95/1/0 default/read/poll queues

Signed-off-by: Weiping Zhang <zhangweiping@didiglobal.com>
---
 drivers/nvme/host/pci.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 4e79e412b276..f9e8c4441405 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -68,14 +68,20 @@ static int io_queue_depth = 1024;
 module_param_cb(io_queue_depth, &io_queue_depth_ops, &io_queue_depth, 0644);
 MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2");
 
+static int io_queue_count_set(const char *val, const struct kernel_param *kp);
+static const struct kernel_param_ops io_queue_count_ops = {
+	.set = io_queue_count_set,
+	.get = param_get_uint,
+};
+
 static unsigned int write_queues;
-module_param(write_queues, uint, 0644);
+module_param_cb(write_queues, &io_queue_count_ops, &write_queues, 0644);
 MODULE_PARM_DESC(write_queues,
 	"Number of queues to use for writes. If not set, reads and writes "
 	"will share a queue set.");
 
 static unsigned int poll_queues;
-module_param(poll_queues, uint, 0644);
+module_param_cb(poll_queues, &io_queue_count_ops, &poll_queues, 0644);
 MODULE_PARM_DESC(poll_queues, "Number of queues to use for polled IO.");
 
 struct nvme_dev;
@@ -141,6 +147,18 @@ static int io_queue_depth_set(const char *val, const struct kernel_param *kp)
 	return param_set_int(val, kp);
 }
 
+static int io_queue_count_set(const char *val, const struct kernel_param *kp)
+{
+	int ret;
+	unsigned int n;
+
+	ret = kstrtouint(val, 10, &n);
+	if (ret != 0 || n > num_possible_cpus())
+		return -EINVAL;
+
+	return param_set_uint(val, kp);
+}
+
 static inline unsigned int sq_idx(unsigned int qid, u32 stride)
 {
 	return qid * 2 * stride;
-- 
2.18.1



* [PATCH 2/2] nvme: no need check write/poll_queues in nvme_init
  2020-04-09 15:57 [PATCH 0/2] verify module parameter write_queues and poll_queues Weiping Zhang
  2020-04-09 15:57 ` [PATCH 1/2] nvme: make sure write/poll_queues less or equal then cpu count Weiping Zhang
@ 2020-04-09 15:57 ` Weiping Zhang
  1 sibling, 0 replies; 5+ messages in thread
From: Weiping Zhang @ 2020-04-09 15:57 UTC (permalink / raw)
  To: hch, axboe, kbusch, sagi; +Cc: linux-nvme

Since these two module parameters have been verified in their set
callbacks, there is no need to check them again.

Signed-off-by: Weiping Zhang <zhangweiping@didiglobal.com>
---
 drivers/nvme/host/pci.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index f9e8c4441405..ef9076aa21b8 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3137,8 +3137,6 @@ static int __init nvme_init(void)
 	BUILD_BUG_ON(sizeof(struct nvme_delete_queue) != 64);
 	BUILD_BUG_ON(IRQ_AFFINITY_MAX_SETS < 2);
 
-	write_queues = min(write_queues, num_possible_cpus());
-	poll_queues = min(poll_queues, num_possible_cpus());
 	return pci_register_driver(&nvme_driver);
 }
 
-- 
2.18.1



* Re: [PATCH 1/2] nvme: make sure write/poll_queues less or equal then cpu count
  2020-04-09 15:57 ` [PATCH 1/2] nvme: make sure write/poll_queues less or equal then cpu count Weiping Zhang
@ 2020-04-09 17:22   ` Keith Busch
  2020-04-10  9:56     ` weiping zhang
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2020-04-09 17:22 UTC (permalink / raw)
  To: Weiping Zhang; +Cc: hch, axboe, sagi, linux-nvme

On Thu, Apr 09, 2020 at 11:57:35PM +0800, Weiping Zhang wrote:
> Check the module parameters write_queues and poll_queues before use; if
> the user sets them larger than the system CPU count, a warning is
> triggered.
> 
> Reproduce:
> 
> echo $((`nproc`+1)) > /sys/module/nvme/parameters/write_queues
> echo 1 > /sys/block/nvme0n1/device/reset_controller

I don't think it is safe to allow these parameters to be runtime writeable
with the current code: the driver allocates space for queues during probe,
so you may end up with out of bounds access if you increase the number
of queues the driver creates after that.


* Re: [PATCH 1/2] nvme: make sure write/poll_queues less or equal then cpu count
  2020-04-09 17:22   ` Keith Busch
@ 2020-04-10  9:56     ` weiping zhang
  0 siblings, 0 replies; 5+ messages in thread
From: weiping zhang @ 2020-04-10  9:56 UTC (permalink / raw)
  To: Keith Busch; +Cc: hch, axboe, linux-nvme, sagi, Weiping Zhang

On Thu, Apr 09, 2020 at 10:22:50AM -0700, Keith Busch wrote:
> On Thu, Apr 09, 2020 at 11:57:35PM +0800, Weiping Zhang wrote:
> > Check the module parameters write_queues and poll_queues before use; if
> > the user sets them larger than the system CPU count, a warning is
> > triggered.
> > 
> > Reproduce:
> > 
> > echo $((`nproc`+1)) > /sys/module/nvme/parameters/write_queues
> > echo 1 > /sys/block/nvme0n1/device/reset_controller
> 
> I don't think it is safe to allow these parameters to be runtime writeable
> with the current code: the driver allocates space for queues during probe,
> so you may end up with out of bounds access if you increase the number
> of queues the driver creates after that.

Yes, if the user sets these parameters larger than the number allocated
in nvme_probe, then nvme_create_io_queues->nvme_alloc_queue will touch
memory out of bounds.

I think we can allow the user to change them dynamically, as long as
the total I/O queue count does not exceed what was allocated.

If the user really wants to adjust the I/O queues when using multiple
tagset maps, they need to set up enough queues when loading the nvme
module; then they can tune the queue count for each tag map.

I will send a separate patch to fix this.

Thanks


end of thread, other threads:[~2020-04-10  9:57 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-09 15:57 [PATCH 0/2] verify module parameter write_queues and poll_queues Weiping Zhang
2020-04-09 15:57 ` [PATCH 1/2] nvme: make sure write/poll_queues less or equal then cpu count Weiping Zhang
2020-04-09 17:22   ` Keith Busch
2020-04-10  9:56     ` weiping zhang
2020-04-09 15:57 ` [PATCH 2/2] nvme: no need check write/poll_queues in nvme_init Weiping Zhang
