From: Israel Rukshin <israelr@mellanox.com>
To: Linux-nvme <linux-nvme@lists.infradead.org>, Sagi Grimberg <sagi@grimberg.me>,
	Christoph Hellwig <hch@lst.de>, James Smart <jsmart2021@gmail.com>,
	Keith Busch <kbusch@kernel.org>
Cc: Israel Rukshin <israelr@mellanox.com>, Max Gurtovoy <maxg@mellanox.com>
Subject: [PATCH 3/3] nvmet-loop: Avoid preallocating big SGL for data
Date: Sun, 24 Nov 2019 18:38:32 +0200
Message-ID: <1574613512-5943-4-git-send-email-israelr@mellanox.com>
In-Reply-To: <1574613512-5943-1-git-send-email-israelr@mellanox.com>

nvme_loop_create_io_queues() preallocates a big buffer for the IO SGL based
on SG_CHUNK_SIZE.

Modern DMA engines are often capable of dealing with very big segments, so
SG_CHUNK_SIZE is often too big; it results in a static 4KB SGL allocation
per command.

If a controller has lots of deep queues, preallocation for the sg list can
consume substantial amounts of memory. For nvmet-loop, nr_hw_queues can be
128 and each queue's depth 128. This means the resulting preallocation for
the data SGL is 128*128*4K = 64MB per controller.

Switch to runtime allocation of the SGL for lists longer than 2 entries.
This is the approach used by NVMe PCI, so it should be reasonable for
NVMeoF as well. Runtime SGL allocation has always been the case for the
legacy I/O path, so this is nothing new.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/nvme/target/loop.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index 856eb06..dae31bf 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -76,7 +76,7 @@ static void nvme_loop_complete_rq(struct request *req)
 {
 	struct nvme_loop_iod *iod = blk_mq_rq_to_pdu(req);
 
-	sg_free_table_chained(&iod->sg_table, SG_CHUNK_SIZE);
+	sg_free_table_chained(&iod->sg_table, NVME_INLINE_SG_CNT);
 	nvme_complete_rq(req);
 }
 
@@ -156,7 +156,7 @@ static blk_status_t nvme_loop_queue_rq(struct blk_mq_hw_ctx *hctx,
 		iod->sg_table.sgl = iod->first_sgl;
 		if (sg_alloc_table_chained(&iod->sg_table,
 				blk_rq_nr_phys_segments(req),
-				iod->sg_table.sgl, SG_CHUNK_SIZE))
+				iod->sg_table.sgl, NVME_INLINE_SG_CNT))
 			return BLK_STS_RESOURCE;
 
 		iod->req.sg = iod->sg_table.sgl;
@@ -340,7 +340,7 @@ static int nvme_loop_configure_admin_queue(struct nvme_loop_ctrl *ctrl)
 	ctrl->admin_tag_set.reserved_tags = 2; /* connect + keep-alive */
 	ctrl->admin_tag_set.numa_node = NUMA_NO_NODE;
 	ctrl->admin_tag_set.cmd_size = sizeof(struct nvme_loop_iod) +
-		SG_CHUNK_SIZE * sizeof(struct scatterlist);
+		NVME_INLINE_SG_CNT * sizeof(struct scatterlist);
 	ctrl->admin_tag_set.driver_data = ctrl;
 	ctrl->admin_tag_set.nr_hw_queues = 1;
 	ctrl->admin_tag_set.timeout = ADMIN_TIMEOUT;
@@ -514,7 +514,7 @@ static int nvme_loop_create_io_queues(struct nvme_loop_ctrl *ctrl)
 	ctrl->tag_set.numa_node = NUMA_NO_NODE;
 	ctrl->tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
 	ctrl->tag_set.cmd_size = sizeof(struct nvme_loop_iod) +
-		SG_CHUNK_SIZE * sizeof(struct scatterlist);
+		NVME_INLINE_SG_CNT * sizeof(struct scatterlist);
 	ctrl->tag_set.driver_data = ctrl;
 	ctrl->tag_set.nr_hw_queues = ctrl->ctrl.queue_count - 1;
 	ctrl->tag_set.timeout = NVME_IO_TIMEOUT;
--
1.8.3.1
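The pattern the patch adopts is easiest to see outside the kernel. The
sketch below is a userspace model, not the kernel code: INLINE_SG_CNT,
struct iod, and the helper names here are illustrative stand-ins for
NVME_INLINE_SG_CNT, struct nvme_loop_iod, and what the
sg_alloc_table_chained()/sg_free_table_chained() pair does when given a
small first chunk. Each command embeds a tiny inline segment array; only
a request with more segments than fit inline pays for a heap allocation
at runtime.

/*
 * Userspace model of the inline-SGL idea (illustrative, not kernel code).
 */
#include <stdio.h>
#include <stdlib.h>

#define INLINE_SG_CNT 2			/* stand-in for NVME_INLINE_SG_CNT */

struct segment { void *addr; unsigned int len; };

struct iod {
	struct segment inline_sg[INLINE_SG_CNT];  /* preallocated per command */
	struct segment *sg;			/* points at inline array or heap */
	unsigned int nents;
};

static int iod_map_segments(struct iod *iod, unsigned int nents)
{
	iod->nents = nents;
	if (nents <= INLINE_SG_CNT) {
		iod->sg = iod->inline_sg;	/* fast path: no allocation */
		return 0;
	}
	iod->sg = calloc(nents, sizeof(*iod->sg)); /* slow path: runtime alloc */
	return iod->sg ? 0 : -1;
}

static void iod_unmap_segments(struct iod *iod)
{
	/* free only what was allocated at runtime, never the inline array */
	if (iod->sg != iod->inline_sg)
		free(iod->sg);
	iod->sg = NULL;
}

int main(void)
{
	struct iod iod;

	/* A 2-segment request stays on the inline array... */
	if (iod_map_segments(&iod, 2) == 0) {
		printf("2 segments: inline=%d\n", iod.sg == iod.inline_sg);
		iod_unmap_segments(&iod);
	}
	/* ...while a 32-segment request allocates at runtime. */
	if (iod_map_segments(&iod, 32) == 0) {
		printf("32 segments: inline=%d\n", iod.sg == iod.inline_sg);
		iod_unmap_segments(&iod);
	}
	return 0;
}

With an inline count of 2, the per-command preallocation shrinks from 128
scatterlist entries (the static 4KB mentioned above) to 2, roughly 64 bytes
assuming a 32-byte struct scatterlist, while small transfers still complete
without touching the allocator in the hot path.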
Thread overview: 9 messages in thread

2019-11-24 16:38 [PATCH 0/3] nvme: Avoid preallocating big SGL for data Israel Rukshin
2019-11-24 16:38 ` [PATCH 1/3] nvme-rdma: Avoid preallocating big SGL for data Israel Rukshin
2019-11-26 16:53   ` Christoph Hellwig
2019-11-24 16:38 ` [PATCH 2/3] nvme-fc: Avoid preallocating big SGL for data Israel Rukshin
2019-11-25 17:04   ` James Smart
2019-11-24 16:38 ` [PATCH 3/3] nvmet-loop: Avoid preallocating big SGL for data Israel Rukshin [this message]
2019-11-25  2:24   ` Chaitanya Kulkarni
2019-11-26 16:53   ` Christoph Hellwig
2019-11-26 17:40   ` Keith Busch