* [PATCHSET v2] nvme: don't do full memset() for command setup
@ 2021-10-18 12:49 Jens Axboe
  2021-10-18 12:49 ` [PATCH 1/2] nvme: move command clear into the various setup helpers Jens Axboe
  2021-10-18 12:49 ` [PATCH 2/2] nvme: don't memset() the normal read/write command Jens Axboe
  0 siblings, 2 replies; 6+ messages in thread
From: Jens Axboe @ 2021-10-18 12:49 UTC
To: linux-block; +Cc: hch

Hi,

Respun this one, splitting it into two pieces.

-- 
Jens Axboe

* [PATCH 1/2] nvme: move command clear into the various setup helpers
  2021-10-18 12:49 [PATCHSET v2] nvme: don't do full memset() for command setup Jens Axboe
@ 2021-10-18 12:49 ` Jens Axboe
  2021-10-18 12:53   ` [PATCH v2 " Jens Axboe
  2021-10-18 12:49 ` [PATCH 2/2] nvme: don't memset() the normal read/write command Jens Axboe
  1 sibling, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2021-10-18 12:49 UTC
To: linux-block; +Cc: hch, Jens Axboe

We don't have to worry about doing extra memsets by moving it outside
the protection of RQF_DONTPREP, as nvme doesn't do partial completions.

This is in preparation for making the read/write fast path not do a full
memset of the command.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/nvme/host/core.c | 11 ++++++++---
 drivers/nvme/host/zns.c  |  2 ++
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index ae15cb714596..7944ad52f213 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -874,6 +874,7 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
 		return BLK_STS_IOERR;
 	}
 
+	memset(cmnd, 0, sizeof(*cmnd));
 	cmnd->dsm.opcode = nvme_cmd_dsm;
 	cmnd->dsm.nsid = cpu_to_le32(ns->head->ns_id);
 	cmnd->dsm.nr = cpu_to_le32(segments - 1);
@@ -890,6 +891,8 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
 static inline blk_status_t nvme_setup_write_zeroes(struct nvme_ns *ns,
 		struct request *req, struct nvme_command *cmnd)
 {
+	memset(cmnd, 0, sizeof(*cmnd));
+
 	if (ns->ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
 		return nvme_setup_discard(ns, req, cmnd);
 
@@ -914,6 +917,8 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 	u16 control = 0;
 	u32 dsmgmt = 0;
 
+	memset(cmnd, 0, sizeof(*cmnd));
+
 	if (req->cmd_flags & REQ_FUA)
 		control |= NVME_RW_FUA;
 	if (req->cmd_flags & (REQ_FAILFAST_DEV | REQ_RAHEAD))
@@ -982,17 +987,17 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
 	struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
 	blk_status_t ret = BLK_STS_OK;
 
-	if (!(req->rq_flags & RQF_DONTPREP)) {
+	if (!(req->rq_flags & RQF_DONTPREP))
 		nvme_clear_nvme_request(req);
-		memset(cmd, 0, sizeof(*cmd));
-	}
 
 	switch (req_op(req)) {
 	case REQ_OP_DRV_IN:
 	case REQ_OP_DRV_OUT:
 		/* these are setup prior to execution in nvme_init_request() */
+		memset(cmd, 0, sizeof(*cmd));
 		break;
 	case REQ_OP_FLUSH:
+		memset(cmd, 0, sizeof(*cmd));
 		nvme_setup_flush(ns, cmd);
 		break;
 	case REQ_OP_ZONE_RESET_ALL:
diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index d95010481fce..bfc259e0d7b8 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -233,6 +233,8 @@ int nvme_ns_report_zones(struct nvme_ns *ns, sector_t sector,
 blk_status_t nvme_setup_zone_mgmt_send(struct nvme_ns *ns, struct request *req,
 		struct nvme_command *c, enum nvme_zone_mgmt_action action)
 {
+	memset(c, 0, sizeof(*c));
+
 	c->zms.opcode = nvme_cmd_zone_mgmt_send;
 	c->zms.nsid = cpu_to_le32(ns->head->ns_id);
 	c->zms.slba = cpu_to_le64(nvme_sect_to_lba(ns, blk_rq_pos(req)));
-- 
2.33.1

* [PATCH v2 1/2] nvme: move command clear into the various setup helpers
  2021-10-18 12:49 ` [PATCH 1/2] nvme: move command clear into the various setup helpers Jens Axboe
@ 2021-10-18 12:53   ` Jens Axboe
  2021-10-19 18:28     ` Keith Busch
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2021-10-18 12:53 UTC
To: linux-block; +Cc: hch

On 10/18/21 6:49 AM, Jens Axboe wrote:
> We don't have to worry about doing extra memsets by moving it outside
> the protection of RQF_DONTPREP, as nvme doesn't do partial completions.
>
> This is in preparation for making the read/write fast path not do a full
> memset of the command.

Gah, v2 of this one below, it sent out an older one.

commit fb4e29f648e320c94f210c54692c754ad69fb6f6
Author: Jens Axboe <axboe@kernel.dk>
Date:   Mon Oct 18 06:45:06 2021 -0600

    nvme: move command clear into the various setup helpers

    We don't have to worry about doing extra memsets by moving it outside
    the protection of RQF_DONTPREP, as nvme doesn't do partial completions.

    This is in preparation for making the read/write fast path not do a full
    memset of the command.

    Signed-off-by: Jens Axboe <axboe@kernel.dk>

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index ae15cb714596..de2250c5b057 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -823,6 +823,7 @@ static void nvme_assign_write_stream(struct nvme_ctrl *ctrl,
 static inline void nvme_setup_flush(struct nvme_ns *ns,
 		struct nvme_command *cmnd)
 {
+	memset(cmnd, 0, sizeof(*cmnd));
 	cmnd->common.opcode = nvme_cmd_flush;
 	cmnd->common.nsid = cpu_to_le32(ns->head->ns_id);
 }
@@ -874,6 +875,7 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
 		return BLK_STS_IOERR;
 	}
 
+	memset(cmnd, 0, sizeof(*cmnd));
 	cmnd->dsm.opcode = nvme_cmd_dsm;
 	cmnd->dsm.nsid = cpu_to_le32(ns->head->ns_id);
 	cmnd->dsm.nr = cpu_to_le32(segments - 1);
@@ -890,6 +892,8 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
 static inline blk_status_t nvme_setup_write_zeroes(struct nvme_ns *ns,
 		struct request *req, struct nvme_command *cmnd)
 {
+	memset(cmnd, 0, sizeof(*cmnd));
+
 	if (ns->ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
 		return nvme_setup_discard(ns, req, cmnd);
 
@@ -914,6 +918,8 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 	u16 control = 0;
 	u32 dsmgmt = 0;
 
+	memset(cmnd, 0, sizeof(*cmnd));
+
 	if (req->cmd_flags & REQ_FUA)
 		control |= NVME_RW_FUA;
 	if (req->cmd_flags & (REQ_FAILFAST_DEV | REQ_RAHEAD))
@@ -982,10 +988,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
 	struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
 	blk_status_t ret = BLK_STS_OK;
 
-	if (!(req->rq_flags & RQF_DONTPREP)) {
+	if (!(req->rq_flags & RQF_DONTPREP))
 		nvme_clear_nvme_request(req);
-		memset(cmd, 0, sizeof(*cmd));
-	}
 
 	switch (req_op(req)) {
 	case REQ_OP_DRV_IN:
diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index d95010481fce..bfc259e0d7b8 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -233,6 +233,8 @@ int nvme_ns_report_zones(struct nvme_ns *ns, sector_t sector,
 blk_status_t nvme_setup_zone_mgmt_send(struct nvme_ns *ns, struct request *req,
 		struct nvme_command *c, enum nvme_zone_mgmt_action action)
 {
+	memset(c, 0, sizeof(*c));
+
 	c->zms.opcode = nvme_cmd_zone_mgmt_send;
 	c->zms.nsid = cpu_to_le32(ns->head->ns_id);
 	c->zms.slba = cpu_to_le64(nvme_sect_to_lba(ns, blk_rq_pos(req)));

-- 
Jens Axboe

* Re: [PATCH v2 1/2] nvme: move command clear into the various setup helpers
  2021-10-18 12:53   ` [PATCH v2 " Jens Axboe
@ 2021-10-19 18:28     ` Keith Busch
  0 siblings, 0 replies; 6+ messages in thread
From: Keith Busch @ 2021-10-19 18:28 UTC
To: Jens Axboe; +Cc: linux-block, hch

On Mon, Oct 18, 2021 at 06:53:02AM -0600, Jens Axboe wrote:
> commit fb4e29f648e320c94f210c54692c754ad69fb6f6
> Author: Jens Axboe <axboe@kernel.dk>
> Date:   Mon Oct 18 06:45:06 2021 -0600
>
>     nvme: move command clear into the various setup helpers
>
>     We don't have to worry about doing extra memsets by moving it outside
>     the protection of RQF_DONTPREP, as nvme doesn't do partial completions.
>
>     This is in preparation for making the read/write fast path not do a full
>     memset of the command.
>
>     Signed-off-by: Jens Axboe <axboe@kernel.dk>

Looks good.

Reviewed-by: Keith Busch <kbusch@kernel.org>

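
For readers skimming the thread, the shape of the change in patch 1/2 can be sketched in userspace C. This is only an illustration under stated assumptions: `struct demo_cmd`, `demo_setup_flush()`, `demo_setup_discard()` and `demo_setup_cmd()` are hypothetical stand-ins, not the driver's types or functions, and the opcode values are illustrative. The point is that the dispatcher no longer clears the buffer; each setup helper does it for itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-byte command buffer standing in for struct nvme_command. */
struct demo_cmd {
	uint8_t  opcode;
	uint8_t  flags;
	uint16_t command_id;
	uint32_t nsid;
	uint32_t arg;       /* one op-specific field, for illustration only */
	uint8_t  rest[52];  /* everything else in the command */
};

static_assert(sizeof(struct demo_cmd) == 64, "command must be 64 bytes");

enum demo_op { DEMO_OP_FLUSH, DEMO_OP_DISCARD };

/* Each helper owns the clear, so the dispatcher does not need one. */
void demo_setup_flush(struct demo_cmd *c, uint32_t nsid)
{
	memset(c, 0, sizeof(*c));
	c->opcode = 0x00;   /* illustrative opcode value */
	c->nsid = nsid;
}

void demo_setup_discard(struct demo_cmd *c, uint32_t nsid, uint32_t nr_ranges)
{
	memset(c, 0, sizeof(*c));
	c->opcode = 0x09;   /* illustrative opcode value */
	c->nsid = nsid;
	c->arg = nr_ranges - 1;
}

/* Dispatcher: no memset here any more. Returns 0 on success, -1 otherwise. */
int demo_setup_cmd(struct demo_cmd *c, enum demo_op op, uint32_t nsid,
		   uint32_t arg)
{
	switch (op) {
	case DEMO_OP_FLUSH:
		demo_setup_flush(c, nsid);
		return 0;
	case DEMO_OP_DISCARD:
		demo_setup_discard(c, nsid, arg);
		return 0;
	}
	return -1;
}
```

Pushing the clear down into the helpers costs nothing for the ops that still memset, and it leaves room for one helper to skip the full clear when it knows it writes every field itself, which is what patch 2/2 does for read/write.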
* [PATCH 2/2] nvme: don't memset() the normal read/write command
  2021-10-18 12:49 [PATCHSET v2] nvme: don't do full memset() for command setup Jens Axboe
  2021-10-18 12:49 ` [PATCH 1/2] nvme: move command clear into the various setup helpers Jens Axboe
@ 2021-10-18 12:49 ` Jens Axboe
  2021-10-19 18:29   ` Keith Busch
  1 sibling, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2021-10-18 12:49 UTC
To: linux-block; +Cc: hch, Jens Axboe, Keith Busch

This memset in the fast path costs a lot of cycles on my setup. Here's a
top-of-profile of doing ~6.7M IOPS:

+    5.90%  io_uring  [nvme]            [k] nvme_queue_rq
+    5.32%  io_uring  [nvme_core]       [k] nvme_setup_cmd
+    5.17%  io_uring  [kernel.vmlinux]  [k] io_submit_sqes
+    4.97%  io_uring  [kernel.vmlinux]  [k] blkdev_direct_IO

and a perf diff with this patch:

     0.92%     +4.40%  [nvme_core]       [k] nvme_setup_cmd

reducing it from 5.3% to only 0.9%. This takes it from the 2nd most
cycle consumer to something that's mostly irrelevant.

Acked-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/nvme/host/core.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 7944ad52f213..3e691354598c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -917,8 +917,6 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 	u16 control = 0;
 	u32 dsmgmt = 0;
 
-	memset(cmnd, 0, sizeof(*cmnd));
-
 	if (req->cmd_flags & REQ_FUA)
 		control |= NVME_RW_FUA;
 	if (req->cmd_flags & (REQ_FAILFAST_DEV | REQ_RAHEAD))
@@ -928,9 +926,15 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 		dsmgmt |= NVME_RW_DSM_FREQ_PREFETCH;
 
 	cmnd->rw.opcode = op;
+	cmnd->rw.flags = 0;
 	cmnd->rw.nsid = cpu_to_le32(ns->head->ns_id);
+	cmnd->rw.rsvd2 = 0;
+	cmnd->rw.metadata = 0;
 	cmnd->rw.slba = cpu_to_le64(nvme_sect_to_lba(ns, blk_rq_pos(req)));
 	cmnd->rw.length = cpu_to_le16((blk_rq_bytes(req) >> ns->lba_shift) - 1);
+	cmnd->rw.reftag = 0;
+	cmnd->rw.apptag = 0;
+	cmnd->rw.appmask = 0;
 
 	if (req_op(req) == REQ_OP_WRITE && ctrl->nr_streams)
 		nvme_assign_write_stream(ctrl, req, &control, &dsmgmt);
-- 
2.33.1

* Re: [PATCH 2/2] nvme: don't memset() the normal read/write command
  2021-10-18 12:49 ` [PATCH 2/2] nvme: don't memset() the normal read/write command Jens Axboe
@ 2021-10-19 18:29   ` Keith Busch
  0 siblings, 0 replies; 6+ messages in thread
From: Keith Busch @ 2021-10-19 18:29 UTC
To: Jens Axboe; +Cc: linux-block, hch

On Mon, Oct 18, 2021 at 06:49:34AM -0600, Jens Axboe wrote:
> This memset in the fast path costs a lot of cycles on my setup. Here's a
> top-of-profile of doing ~6.7M IOPS:
>
> +    5.90%  io_uring  [nvme]            [k] nvme_queue_rq
> +    5.32%  io_uring  [nvme_core]       [k] nvme_setup_cmd
> +    5.17%  io_uring  [kernel.vmlinux]  [k] io_submit_sqes
> +    4.97%  io_uring  [kernel.vmlinux]  [k] blkdev_direct_IO
>
> and a perf diff with this patch:
>
>      0.92%     +4.40%  [nvme_core]       [k] nvme_setup_cmd
>
> reducing it from 5.3% to only 0.9%. This takes it from the 2nd most
> cycle consumer to something that's mostly irrelevant.
>
> Acked-by: Keith Busch <kbusch@kernel.org>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Looks good.

Reviewed-by: Keith Busch <kbusch@kernel.org>

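
The idea behind patch 2/2 is that a full clear followed by a handful of stores can be replaced by one store per field, as long as every field really is written before the command is used. A minimal userspace sketch of that equivalence follows. Everything in it is an assumption made for illustration: `struct demo_rw_cmd` only borrows the field names that appear in the diff, values are host-endian rather than little-endian, and `demo_finish()` stands in for whatever code fills the command id and data pointers later; the thread itself does not show that part.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 64-byte read/write command (not the kernel's). */
struct demo_rw_cmd {
	uint8_t  opcode;
	uint8_t  flags;
	uint16_t command_id;
	uint32_t nsid;
	uint64_t rsvd2;
	uint64_t metadata;
	uint64_t prp1;
	uint64_t prp2;
	uint64_t slba;
	uint16_t length;
	uint16_t control;
	uint32_t dsmgmt;
	uint32_t reftag;
	uint16_t apptag;
	uint16_t appmask;
};

static_assert(sizeof(struct demo_rw_cmd) == 64, "command must be 64 bytes");

/* Baseline: clear all 64 bytes, then overwrite the fields that matter. */
void demo_setup_rw_memset(struct demo_rw_cmd *c, uint8_t op, uint32_t nsid,
			  uint64_t slba, uint16_t nlb, uint16_t control,
			  uint32_t dsmgmt)
{
	memset(c, 0, sizeof(*c));
	c->opcode = op;
	c->nsid = nsid;
	c->slba = slba;
	c->length = nlb;
	c->control = control;
	c->dsmgmt = dsmgmt;
}

/*
 * Variant: no memset, each field is stored exactly once. command_id, prp1
 * and prp2 are deliberately left alone here; this sketch assumes the caller
 * fills them afterwards (see demo_finish()).
 */
void demo_setup_rw_fields(struct demo_rw_cmd *c, uint8_t op, uint32_t nsid,
			  uint64_t slba, uint16_t nlb, uint16_t control,
			  uint32_t dsmgmt)
{
	c->opcode = op;
	c->flags = 0;
	c->nsid = nsid;
	c->rsvd2 = 0;
	c->metadata = 0;
	c->slba = slba;
	c->length = nlb;
	c->control = control;
	c->dsmgmt = dsmgmt;
	c->reftag = 0;
	c->apptag = 0;
	c->appmask = 0;
}

/* Hypothetical later stage that supplies the remaining fields. */
void demo_finish(struct demo_rw_cmd *c, uint16_t cid, uint64_t prp1,
		 uint64_t prp2)
{
	c->command_id = cid;
	c->prp1 = prp1;
	c->prp2 = prp2;
}
```

Starting both variants from a buffer full of garbage and comparing the results with `memcmp()` is a cheap way to check that no field was forgotten. The trade-off is maintenance: if a field is later added to the structure, the field-wise variant has to be updated by hand, whereas the memset version would have covered it automatically.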