* [PATCHSET v2] nvme: don't do full memset() for command setup
@ 2021-10-18 12:49 Jens Axboe
2021-10-18 12:49 ` [PATCH 1/2] nvme: move command clear into the various setup helpers Jens Axboe
2021-10-18 12:49 ` [PATCH 2/2] nvme: don't memset() the normal read/write command Jens Axboe
0 siblings, 2 replies; 6+ messages in thread
From: Jens Axboe @ 2021-10-18 12:49 UTC (permalink / raw)
To: linux-block; +Cc: hch
Hi,
Respun this one, splitting it into two pieces.
--
Jens Axboe
* [PATCH 1/2] nvme: move command clear into the various setup helpers
2021-10-18 12:49 [PATCHSET v2] nvme: don't do full memset() for command setup Jens Axboe
@ 2021-10-18 12:49 ` Jens Axboe
2021-10-18 12:53 ` [PATCH v2 " Jens Axboe
2021-10-18 12:49 ` [PATCH 2/2] nvme: don't memset() the normal read/write command Jens Axboe
1 sibling, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2021-10-18 12:49 UTC (permalink / raw)
To: linux-block; +Cc: hch, Jens Axboe
We don't have to worry about doing extra memsets by moving it outside
the protection of RQF_DONTPREP, as nvme doesn't do partial completions.
This is in preparation for making the read/write fast path not do a full
memset of the command.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
drivers/nvme/host/core.c | 11 ++++++++---
drivers/nvme/host/zns.c | 2 ++
2 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index ae15cb714596..7944ad52f213 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -874,6 +874,7 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
 		return BLK_STS_IOERR;
 	}
 
+	memset(cmnd, 0, sizeof(*cmnd));
 	cmnd->dsm.opcode = nvme_cmd_dsm;
 	cmnd->dsm.nsid = cpu_to_le32(ns->head->ns_id);
 	cmnd->dsm.nr = cpu_to_le32(segments - 1);
@@ -890,6 +891,8 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
 static inline blk_status_t nvme_setup_write_zeroes(struct nvme_ns *ns,
 		struct request *req, struct nvme_command *cmnd)
 {
+	memset(cmnd, 0, sizeof(*cmnd));
+
 	if (ns->ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
 		return nvme_setup_discard(ns, req, cmnd);
 
@@ -914,6 +917,8 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 	u16 control = 0;
 	u32 dsmgmt = 0;
 
+	memset(cmnd, 0, sizeof(*cmnd));
+
 	if (req->cmd_flags & REQ_FUA)
 		control |= NVME_RW_FUA;
 	if (req->cmd_flags & (REQ_FAILFAST_DEV | REQ_RAHEAD))
@@ -982,17 +987,17 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
 	struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
 	blk_status_t ret = BLK_STS_OK;
 
-	if (!(req->rq_flags & RQF_DONTPREP)) {
+	if (!(req->rq_flags & RQF_DONTPREP))
 		nvme_clear_nvme_request(req);
-		memset(cmd, 0, sizeof(*cmd));
-	}
 
 	switch (req_op(req)) {
 	case REQ_OP_DRV_IN:
 	case REQ_OP_DRV_OUT:
 		/* these are setup prior to execution in nvme_init_request() */
+		memset(cmd, 0, sizeof(*cmd));
 		break;
 	case REQ_OP_FLUSH:
+		memset(cmd, 0, sizeof(*cmd));
 		nvme_setup_flush(ns, cmd);
 		break;
 	case REQ_OP_ZONE_RESET_ALL:
diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index d95010481fce..bfc259e0d7b8 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -233,6 +233,8 @@ int nvme_ns_report_zones(struct nvme_ns *ns, sector_t sector,
 blk_status_t nvme_setup_zone_mgmt_send(struct nvme_ns *ns, struct request *req,
 		struct nvme_command *c, enum nvme_zone_mgmt_action action)
 {
+	memset(c, 0, sizeof(*c));
+
 	c->zms.opcode = nvme_cmd_zone_mgmt_send;
 	c->zms.nsid = cpu_to_le32(ns->head->ns_id);
 	c->zms.slba = cpu_to_le64(nvme_sect_to_lba(ns, blk_rq_pos(req)));
--
2.33.1
* [PATCH 2/2] nvme: don't memset() the normal read/write command
2021-10-18 12:49 [PATCHSET v2] nvme: don't do full memset() for command setup Jens Axboe
2021-10-18 12:49 ` [PATCH 1/2] nvme: move command clear into the various setup helpers Jens Axboe
@ 2021-10-18 12:49 ` Jens Axboe
2021-10-19 18:29 ` Keith Busch
1 sibling, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2021-10-18 12:49 UTC (permalink / raw)
To: linux-block; +Cc: hch, Jens Axboe, Keith Busch
This memset in the fast path costs a lot of cycles on my setup. Here's a
top-of-profile of doing ~6.7M IOPS:
+ 5.90% io_uring [nvme] [k] nvme_queue_rq
+ 5.32% io_uring [nvme_core] [k] nvme_setup_cmd
+ 5.17% io_uring [kernel.vmlinux] [k] io_submit_sqes
+ 4.97% io_uring [kernel.vmlinux] [k] blkdev_direct_IO
and a perf diff with this patch:
0.92% +4.40% [nvme_core] [k] nvme_setup_cmd
reducing it from 5.3% to only 0.9%. This takes it from the second-largest
cycle consumer to something that's mostly irrelevant.
Acked-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
drivers/nvme/host/core.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 7944ad52f213..3e691354598c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -917,8 +917,6 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 	u16 control = 0;
 	u32 dsmgmt = 0;
 
-	memset(cmnd, 0, sizeof(*cmnd));
-
 	if (req->cmd_flags & REQ_FUA)
 		control |= NVME_RW_FUA;
 	if (req->cmd_flags & (REQ_FAILFAST_DEV | REQ_RAHEAD))
@@ -928,9 +926,15 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 		dsmgmt |= NVME_RW_DSM_FREQ_PREFETCH;
 
 	cmnd->rw.opcode = op;
+	cmnd->rw.flags = 0;
 	cmnd->rw.nsid = cpu_to_le32(ns->head->ns_id);
+	cmnd->rw.rsvd2 = 0;
+	cmnd->rw.metadata = 0;
 	cmnd->rw.slba = cpu_to_le64(nvme_sect_to_lba(ns, blk_rq_pos(req)));
 	cmnd->rw.length = cpu_to_le16((blk_rq_bytes(req) >> ns->lba_shift) - 1);
+	cmnd->rw.reftag = 0;
+	cmnd->rw.apptag = 0;
+	cmnd->rw.appmask = 0;
 
 	if (req_op(req) == REQ_OP_WRITE && ctrl->nr_streams)
 		nvme_assign_write_stream(ctrl, req, &control, &dsmgmt);
--
2.33.1
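The idea in the patch above can be checked in isolation with a small userspace sketch. It is not the driver code: struct demo_rw_cmd, demo_fill_late() and the other demo_* names are hypothetical, and the 64-byte layout is an assumption modeled on the fields the diff assigns. It also assumes that command_id, the data pointer, control and dsmgmt are written by later code, so a stand-in helper sets them in both variants. Starting from a buffer full of stale 0xff bytes, the sketch compares "memset, then assign" against "assign every field explicitly".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 64-byte NVMe read/write command. */
struct demo_rw_cmd {
	uint8_t  opcode;
	uint8_t  flags;
	uint16_t command_id;
	uint32_t nsid;
	uint64_t rsvd2;
	uint64_t metadata;
	uint64_t prp1;		/* data pointer, assumed to be set later */
	uint64_t prp2;
	uint64_t slba;
	uint16_t length;
	uint16_t control;
	uint32_t dsmgmt;
	uint32_t reftag;
	uint16_t apptag;
	uint16_t appmask;
};

/* No padding, so memcmp() over the whole struct is meaningful. */
static_assert(sizeof(struct demo_rw_cmd) == 64, "unexpected padding");

/* Stand-in for the fields that other code is assumed to write later. */
static void demo_fill_late(struct demo_rw_cmd *c)
{
	c->command_id = 7;
	c->prp1 = 0x1000;
	c->prp2 = 0;
	c->control = 0;
	c->dsmgmt = 0;
}

/* Variant A: clear all 64 bytes, then assign the interesting fields. */
static void demo_setup_memset(struct demo_rw_cmd *c, uint8_t op,
			      uint32_t nsid, uint64_t slba, uint16_t nlb)
{
	memset(c, 0, sizeof(*c));
	c->opcode = op;
	c->nsid = nsid;
	c->slba = slba;
	c->length = (uint16_t)(nlb - 1);
	demo_fill_late(c);
}

/* Variant B: no memset, every remaining field is assigned explicitly. */
static void demo_setup_fields(struct demo_rw_cmd *c, uint8_t op,
			      uint32_t nsid, uint64_t slba, uint16_t nlb)
{
	c->opcode = op;
	c->flags = 0;
	c->nsid = nsid;
	c->rsvd2 = 0;
	c->metadata = 0;
	c->slba = slba;
	c->length = (uint16_t)(nlb - 1);
	c->reftag = 0;
	c->apptag = 0;
	c->appmask = 0;
	demo_fill_late(c);
}

/* Returns 1 if both variants yield identical bytes from a dirty buffer. */
int demo_variants_match(void)
{
	struct demo_rw_cmd a, b;

	memset(&a, 0xff, sizeof(a));	/* simulate a reused request */
	memset(&b, 0xff, sizeof(b));
	demo_setup_memset(&a, 0x02, 1, 4096, 8);
	demo_setup_fields(&b, 0x02, 1, 4096, 8);
	return memcmp(&a, &b, sizeof(a)) == 0;
}
```

The comparison only holds as long as every field is covered by either the explicit assignments or the later writers; a field added to the struct without a matching assignment would leak stale bytes in variant B.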
* [PATCH v2 1/2] nvme: move command clear into the various setup helpers
2021-10-18 12:49 ` [PATCH 1/2] nvme: move command clear into the various setup helpers Jens Axboe
@ 2021-10-18 12:53 ` Jens Axboe
2021-10-19 18:28 ` Keith Busch
0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2021-10-18 12:53 UTC (permalink / raw)
To: linux-block; +Cc: hch
On 10/18/21 6:49 AM, Jens Axboe wrote:
> We don't have to worry about doing extra memsets by moving it outside
> the protection of RQF_DONTPREP, as nvme doesn't do partial completions.
>
> This is in preparation for making the read/write fast path not do a full
> memset of the command.
Gah, v2 of this one below; it sent out an older one.
commit fb4e29f648e320c94f210c54692c754ad69fb6f6
Author: Jens Axboe <axboe@kernel.dk>
Date: Mon Oct 18 06:45:06 2021 -0600
nvme: move command clear into the various setup helpers
We don't have to worry about doing extra memsets by moving it outside
the protection of RQF_DONTPREP, as nvme doesn't do partial completions.
This is in preparation for making the read/write fast path not do a full
memset of the command.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index ae15cb714596..de2250c5b057 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -823,6 +823,7 @@ static void nvme_assign_write_stream(struct nvme_ctrl *ctrl,
 static inline void nvme_setup_flush(struct nvme_ns *ns,
 		struct nvme_command *cmnd)
 {
+	memset(cmnd, 0, sizeof(*cmnd));
 	cmnd->common.opcode = nvme_cmd_flush;
 	cmnd->common.nsid = cpu_to_le32(ns->head->ns_id);
 }
@@ -874,6 +875,7 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
 		return BLK_STS_IOERR;
 	}
 
+	memset(cmnd, 0, sizeof(*cmnd));
 	cmnd->dsm.opcode = nvme_cmd_dsm;
 	cmnd->dsm.nsid = cpu_to_le32(ns->head->ns_id);
 	cmnd->dsm.nr = cpu_to_le32(segments - 1);
@@ -890,6 +892,8 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
 static inline blk_status_t nvme_setup_write_zeroes(struct nvme_ns *ns,
 		struct request *req, struct nvme_command *cmnd)
 {
+	memset(cmnd, 0, sizeof(*cmnd));
+
 	if (ns->ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
 		return nvme_setup_discard(ns, req, cmnd);
 
@@ -914,6 +918,8 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 	u16 control = 0;
 	u32 dsmgmt = 0;
 
+	memset(cmnd, 0, sizeof(*cmnd));
+
 	if (req->cmd_flags & REQ_FUA)
 		control |= NVME_RW_FUA;
 	if (req->cmd_flags & (REQ_FAILFAST_DEV | REQ_RAHEAD))
@@ -982,10 +988,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
 	struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
 	blk_status_t ret = BLK_STS_OK;
 
-	if (!(req->rq_flags & RQF_DONTPREP)) {
+	if (!(req->rq_flags & RQF_DONTPREP))
 		nvme_clear_nvme_request(req);
-		memset(cmd, 0, sizeof(*cmd));
-	}
 
 	switch (req_op(req)) {
 	case REQ_OP_DRV_IN:
diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index d95010481fce..bfc259e0d7b8 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -233,6 +233,8 @@ int nvme_ns_report_zones(struct nvme_ns *ns, sector_t sector,
 blk_status_t nvme_setup_zone_mgmt_send(struct nvme_ns *ns, struct request *req,
 		struct nvme_command *c, enum nvme_zone_mgmt_action action)
 {
+	memset(c, 0, sizeof(*c));
+
 	c->zms.opcode = nvme_cmd_zone_mgmt_send;
 	c->zms.nsid = cpu_to_le32(ns->head->ns_id);
 	c->zms.slba = cpu_to_le64(nvme_sect_to_lba(ns, blk_rq_pos(req)));
--
Jens Axboe
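The shape of the v2 change, with each setup helper clearing its own command and the dispatcher no longer clearing anything, can be sketched in userspace as follows. This is a simplified illustration, not the driver: struct demo_cmd, enum demo_op and the demo_* functions are hypothetical names, and the passthrough case stands in for the REQ_OP_DRV_IN/REQ_OP_DRV_OUT commands that the diff's comment says are set up before nvme_setup_cmd() runs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum demo_op { DEMO_OP_PASSTHROUGH, DEMO_OP_FLUSH };

/* Hypothetical 64-byte command with only the common header spelled out. */
struct demo_cmd {
	uint8_t  opcode;
	uint8_t  flags;
	uint16_t command_id;
	uint32_t nsid;
	uint8_t  rest[56];
};

static_assert(sizeof(struct demo_cmd) == 64, "unexpected padding");

/* The helper owns the clear, mirroring the memset added to each helper. */
static void demo_setup_flush(struct demo_cmd *c, uint32_t nsid)
{
	memset(c, 0, sizeof(*c));
	c->opcode = 0x00;	/* NVMe flush opcode */
	c->nsid = nsid;
}

/* The dispatcher itself clears nothing. */
static void demo_setup_cmd(enum demo_op op, struct demo_cmd *c, uint32_t nsid)
{
	switch (op) {
	case DEMO_OP_PASSTHROUGH:
		/* built by the submitter beforehand, so it is left untouched */
		break;
	case DEMO_OP_FLUSH:
		demo_setup_flush(c, nsid);
		break;
	}
}

/* Returns 1 if a flush built on top of stale bytes comes out fully clean. */
int demo_flush_is_clean(void)
{
	struct demo_cmd c;
	size_t i;

	memset(&c, 0xff, sizeof(c));
	demo_setup_cmd(DEMO_OP_FLUSH, &c, 3);
	if (c.opcode != 0x00 || c.flags != 0 || c.command_id != 0 || c.nsid != 3)
		return 0;
	for (i = 0; i < sizeof(c.rest); i++)
		if (c.rest[i] != 0)
			return 0;
	return 1;
}

/* Returns 1 if a pre-built passthrough command survives the dispatcher. */
int demo_passthrough_preserved(void)
{
	struct demo_cmd c, before;

	memset(&c, 0, sizeof(c));
	c.opcode = 0x06;	/* arbitrary pre-built command for the demo */
	c.nsid = 1;
	before = c;
	demo_setup_cmd(DEMO_OP_PASSTHROUGH, &c, 3);
	return memcmp(&c, &before, sizeof(c)) == 0;
}
```

Keeping the clear inside each helper is what lets a later change drop it from a single hot helper without touching the others.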
* Re: [PATCH v2 1/2] nvme: move command clear into the various setup helpers
2021-10-18 12:53 ` [PATCH v2 " Jens Axboe
@ 2021-10-19 18:28 ` Keith Busch
0 siblings, 0 replies; 6+ messages in thread
From: Keith Busch @ 2021-10-19 18:28 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, hch
On Mon, Oct 18, 2021 at 06:53:02AM -0600, Jens Axboe wrote:
> commit fb4e29f648e320c94f210c54692c754ad69fb6f6
> Author: Jens Axboe <axboe@kernel.dk>
> Date: Mon Oct 18 06:45:06 2021 -0600
>
> nvme: move command clear into the various setup helpers
>
> We don't have to worry about doing extra memsets by moving it outside
> the protection of RQF_DONTPREP, as nvme doesn't do partial completions.
>
> This is in preparation for making the read/write fast path not do a full
> memset of the command.
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Looks good.
Reviewed-by: Keith Busch <kbusch@kernel.org>
* Re: [PATCH 2/2] nvme: don't memset() the normal read/write command
2021-10-18 12:49 ` [PATCH 2/2] nvme: don't memset() the normal read/write command Jens Axboe
@ 2021-10-19 18:29 ` Keith Busch
0 siblings, 0 replies; 6+ messages in thread
From: Keith Busch @ 2021-10-19 18:29 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, hch
On Mon, Oct 18, 2021 at 06:49:34AM -0600, Jens Axboe wrote:
> This memset in the fast path costs a lot of cycles on my setup. Here's a
> top-of-profile of doing ~6.7M IOPS:
>
> + 5.90% io_uring [nvme] [k] nvme_queue_rq
> + 5.32% io_uring [nvme_core] [k] nvme_setup_cmd
> + 5.17% io_uring [kernel.vmlinux] [k] io_submit_sqes
> + 4.97% io_uring [kernel.vmlinux] [k] blkdev_direct_IO
>
> and a perf diff with this patch:
>
> 0.92% +4.40% [nvme_core] [k] nvme_setup_cmd
>
> reducing it from 5.3% to only 0.9%. This takes it from the second-largest
> cycle consumer to something that's mostly irrelevant.
>
> Acked-by: Keith Busch <kbusch@kernel.org>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Looks good.
Reviewed-by: Keith Busch <kbusch@kernel.org>