linux-nvme.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/14] nvmet: passthru few fixes and improvements
@ 2020-08-10 18:54 Chaitanya Kulkarni
  2020-08-10 18:54 ` [PATCH 01/14] nvmet: for pt I/O cmds annotate req->sg_cnt likely Chaitanya Kulkarni
                   ` (13 more replies)
  0 siblings, 14 replies; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 18:54 UTC (permalink / raw)
  To: linux-nvme; +Cc: hch, Chaitanya Kulkarni, sagi

Hi,

Over the weekend while testing I found few fixes and improvements over
passthru. Following is the patch distribution:-

* PATCH 1-7 Adds annotations, fixes type mismatch, trim downs the code &
            uses nvmet_passthru_req_done() instead of duplicating the 
            function. No functionality change.

* Patch 8-9 Updates the flags setting for write commands in fast path.

* PATCH 10-11 Fixes the module ref count Oops in load-unload sequence. 

* PATCH 12-13 Trims down the bio req append n the context of passthru in
              fast path. No functionality change, perf numbers does show
              improvement.

* PATCH 14 Avoids an allocation and uses an inline-bvec-bio combination
           for fast-path when possible.      

I can see nvme-5.9 and nvme-5.9-rc branch, I've generated this series on
nvme-5.9 + http://lists.infradead.org/pipermail/linux-nvme/2020-August/018708.html.

Regards,
Chaitanya

Chaitanya Kulkarni (14):
  nvmet: for pt I/O cmds annotate req->sg_cnt likely
  nvmet: for pt I/O commands use likely for ns check
  nvmet: use consistent type with id->nlbaf
  nvmet: use consistent type for op_flag
  nvmet: get rid of the extra variable
  nvmet: use unlikely for uncommon commands
  nvmet: reuse pt-req-done for rq completion
  nvmet: use nvme write cmd group setting op_flag
  nvmet: based on passthru req control use REQ_FUA
  nvmet: prep patch for pt ctrl put wrapper
  nvmet: fix nvme module ref count Oops
  block: move blk_rq_bio_prep() to linux/blk-mq.h
  nvmet: use minimized version of blk_rq_append_bio
  nvmet: use inline bio for passthru fast path

 block/blk.h                    |  12 ----
 drivers/nvme/host/core.c       |   5 ++
 drivers/nvme/target/nvmet.h    |   1 +
 drivers/nvme/target/passthru.c | 109 +++++++++++++++++++++++----------
 include/linux/blk-mq.h         |  12 ++++
 5 files changed, 93 insertions(+), 46 deletions(-)

-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH 01/14] nvmet: for pt I/O cmds annotate req->sg_cnt likely
  2020-08-10 18:54 [PATCH 00/14] nvmet: passthru few fixes and improvements Chaitanya Kulkarni
@ 2020-08-10 18:54 ` Chaitanya Kulkarni
  2020-08-10 21:54   ` Logan Gunthorpe
  2020-08-10 18:54 ` [PATCH 02/14] nvmet: for pt I/O commands use likely for ns check Chaitanya Kulkarni
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 18:54 UTC (permalink / raw)
  To: linux-nvme; +Cc: hch, Chaitanya Kulkarni, sagi

I/O Commands (nvme_cmd_read, nvme_cmd_write) are most command commands
when accessing passthru controller. For I/O commands data is represented
in SG list, so sg_cnt != 0 in most cases.

Annotate req->sg_cnt check with likely.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/passthru.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index 68701754b036..fd1f754de5b4 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -242,7 +242,7 @@ static void nvmet_passthru_execute_cmd(struct nvmet_req *req)
 		goto out_put_ns;
 	}
 
-	if (req->sg_cnt) {
+	if (likely(req->sg_cnt)) {
 		ret = nvmet_passthru_map_sg(req, rq);
 		if (unlikely(ret)) {
 			status = NVME_SC_INTERNAL;
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 02/14] nvmet: for pt I/O commands use likely for ns check
  2020-08-10 18:54 [PATCH 00/14] nvmet: passthru few fixes and improvements Chaitanya Kulkarni
  2020-08-10 18:54 ` [PATCH 01/14] nvmet: for pt I/O cmds annotate req->sg_cnt likely Chaitanya Kulkarni
@ 2020-08-10 18:54 ` Chaitanya Kulkarni
  2020-08-10 18:54 ` [PATCH 03/14] nvmet: use consistent type with id->nlbaf Chaitanya Kulkarni
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 18:54 UTC (permalink / raw)
  To: linux-nvme; +Cc: hch, Chaitanya Kulkarni, sagi

I/O commands (nvme_cmd_read, nvme_cmd_write) are most common commands
when accessing passthru controller. For I/O commands ns is always
present. Annotate post request submission ns check with likely.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/passthru.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index fd1f754de5b4..b675662654ae 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -268,7 +268,7 @@ static void nvmet_passthru_execute_cmd(struct nvmet_req *req)
 				      nvmet_passthru_req_done);
 	}
 
-	if (ns)
+	if (likely(ns))
 		nvme_put_ns(ns);
 
 	return;
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 03/14] nvmet: use consistent type with id->nlbaf
  2020-08-10 18:54 [PATCH 00/14] nvmet: passthru few fixes and improvements Chaitanya Kulkarni
  2020-08-10 18:54 ` [PATCH 01/14] nvmet: for pt I/O cmds annotate req->sg_cnt likely Chaitanya Kulkarni
  2020-08-10 18:54 ` [PATCH 02/14] nvmet: for pt I/O commands use likely for ns check Chaitanya Kulkarni
@ 2020-08-10 18:54 ` Chaitanya Kulkarni
  2020-08-10 21:55   ` Logan Gunthorpe
  2020-08-10 18:54 ` [PATCH 04/14] nvmet: use consistent type for op_flag Chaitanya Kulkarni
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 18:54 UTC (permalink / raw)
  To: linux-nvme; +Cc: hch, Chaitanya Kulkarni, sagi

In function nvmet_passthru_override_id_ns() while iterating
over namespace lba format loop variable is declared as int that is
inconsistent with the id->nlbaf type which is u8.

Make loop variable of the same type as id->nlbaf.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/passthru.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index b675662654ae..13d73f36e927 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -112,7 +112,7 @@ static u16 nvmet_passthru_override_id_ns(struct nvmet_req *req)
 {
 	u16 status = NVME_SC_SUCCESS;
 	struct nvme_id_ns *id;
-	int i;
+	u8 i;
 
 	id = kzalloc(sizeof(*id), GFP_KERNEL);
 	if (!id)
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 04/14] nvmet: use consistent type for op_flag
  2020-08-10 18:54 [PATCH 00/14] nvmet: passthru few fixes and improvements Chaitanya Kulkarni
                   ` (2 preceding siblings ...)
  2020-08-10 18:54 ` [PATCH 03/14] nvmet: use consistent type with id->nlbaf Chaitanya Kulkarni
@ 2020-08-10 18:54 ` Chaitanya Kulkarni
  2020-08-10 21:56   ` Logan Gunthorpe
  2020-08-10 18:54 ` [PATCH 05/14] nvmet: get rid of the extra variable Chaitanya Kulkarni
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 18:54 UTC (permalink / raw)
  To: linux-nvme; +Cc: hch, Chaitanya Kulkarni, sagi

In function nvmet_passthru_map_sg() we set the bio->bi_opf ored with
req_op and op_flags set based on the NVMe cmd. The variable bio->bi_opf
is declared as unsigned int use same type for op_flag.

Also, adjust the indentation according to new length.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/passthru.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index 13d73f36e927..58377420874f 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -180,9 +180,9 @@ static void nvmet_passthru_req_done(struct request *rq,
 
 static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq)
 {
+	unsigned int op_flags = 0;
 	int sg_cnt = req->sg_cnt;
 	struct scatterlist *sg;
-	int op_flags = 0;
 	struct bio *bio;
 	int i, ret;
 
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 05/14] nvmet: get rid of the extra variable
  2020-08-10 18:54 [PATCH 00/14] nvmet: passthru few fixes and improvements Chaitanya Kulkarni
                   ` (3 preceding siblings ...)
  2020-08-10 18:54 ` [PATCH 04/14] nvmet: use consistent type for op_flag Chaitanya Kulkarni
@ 2020-08-10 18:54 ` Chaitanya Kulkarni
  2020-08-10 21:58   ` Logan Gunthorpe
  2020-08-10 18:54 ` [PATCH 06/14] nvmet: use unlikely for uncommon commands Chaitanya Kulkarni
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 18:54 UTC (permalink / raw)
  To: linux-nvme; +Cc: hch, Chaitanya Kulkarni, sagi

In function nvmet_passthru_execute_cmd() the return value of the
function nvmet_passthru_map_sg() is not used to determine stauts i.e.
we can safely ignore the actual cause of error that is returned and just
check for binary status pass or fail.

The prior check to this function call can be combined safely with the &&
so that we can get rid of the extra variable and second if condition.

Remove the extra variable and adjust the code accordingly to have only
one if condition.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/passthru.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index 58377420874f..9a175b8b9d57 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -221,7 +221,6 @@ static void nvmet_passthru_execute_cmd(struct nvmet_req *req)
 	struct request *rq = NULL;
 	u32 effects;
 	u16 status;
-	int ret;
 
 	if (likely(req->sq->qid != 0)) {
 		u32 nsid = le32_to_cpu(req->cmd->common.nsid);
@@ -242,12 +241,9 @@ static void nvmet_passthru_execute_cmd(struct nvmet_req *req)
 		goto out_put_ns;
 	}
 
-	if (likely(req->sg_cnt)) {
-		ret = nvmet_passthru_map_sg(req, rq);
-		if (unlikely(ret)) {
-			status = NVME_SC_INTERNAL;
-			goto out_put_req;
-		}
+	if (likely(req->sg_cnt) && unlikely(nvmet_passthru_map_sg(req, rq))) {
+		status = NVME_SC_INTERNAL;
+		goto out_put_req;
 	}
 
 	/*
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 06/14] nvmet: use unlikely for uncommon commands
  2020-08-10 18:54 [PATCH 00/14] nvmet: passthru few fixes and improvements Chaitanya Kulkarni
                   ` (4 preceding siblings ...)
  2020-08-10 18:54 ` [PATCH 05/14] nvmet: get rid of the extra variable Chaitanya Kulkarni
@ 2020-08-10 18:54 ` Chaitanya Kulkarni
  2020-08-10 20:46   ` Keith Busch
  2020-08-10 18:54 ` [PATCH 07/14] nvmet: reuse pt-req-done for rq completion Chaitanya Kulkarni
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 18:54 UTC (permalink / raw)
  To: linux-nvme; +Cc: hch, Chaitanya Kulkarni, sagi

I/O commands (nvme_cmd_read, nvme_cmd_write) are most common commands
when accessing passthru controller, most controllers should not
have set the effects for r/w I/O commnds (atleast I don't know at this
moment). Also, check for req->p.use_workqueue is true in two admin
commands only which are low frequency commands.

Annotate use_workqueue and command effects check with unlikely in the
fast path.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/passthru.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index 9a175b8b9d57..00270e20c83a 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -254,7 +254,7 @@ static void nvmet_passthru_execute_cmd(struct nvmet_req *req)
 	 * which is typically in interrupt context.
 	 */
 	effects = nvme_command_effects(ctrl, ns, req->cmd->common.opcode);
-	if (req->p.use_workqueue || effects) {
+	if (unlikely(req->p.use_workqueue || effects)) {
 		INIT_WORK(&req->p.work, nvmet_passthru_execute_cmd_work);
 		req->p.rq = rq;
 		schedule_work(&req->p.work);
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 07/14] nvmet: reuse pt-req-done for rq completion
  2020-08-10 18:54 [PATCH 00/14] nvmet: passthru few fixes and improvements Chaitanya Kulkarni
                   ` (5 preceding siblings ...)
  2020-08-10 18:54 ` [PATCH 06/14] nvmet: use unlikely for uncommon commands Chaitanya Kulkarni
@ 2020-08-10 18:54 ` Chaitanya Kulkarni
  2020-08-10 22:10   ` Logan Gunthorpe
  2020-08-10 18:54 ` [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag Chaitanya Kulkarni
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 18:54 UTC (permalink / raw)
  To: linux-nvme; +Cc: hch, Chaitanya Kulkarni, sagi

In the function nvmet_passthru_execute_cmd_work() after overriding
id_[ctr|ns] results it completes the request by assigning the target
request completion queue entry result from nvme request result,
actually completing request on the transport and calling
blk_mq_free_request().

The function nvmet_passhru_req_done does the same thing, use that and
remove the duplicate code in nvmet_passhru_execute_cmd_work().

The last parameter for the nvmet_passthru_req_done() which is
blk_status_t is not used since we derive the result from the nvme_req.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/passthru.c | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index 00270e20c83a..0b18038c44bb 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -142,6 +142,15 @@ static u16 nvmet_passthru_override_id_ns(struct nvmet_req *req)
 	return status;
 }
 
+static void nvmet_passthru_req_done(struct request *rq, blk_status_t blk_sts)
+{
+	struct nvmet_req *req = rq->end_io_data;
+
+	req->cqe->result = nvme_req(rq)->result;
+	nvmet_req_complete(req, nvme_req(rq)->status);
+	blk_mq_free_request(rq);
+}
+
 static void nvmet_passthru_execute_cmd_work(struct work_struct *w)
 {
 	struct nvmet_req *req = container_of(w, struct nvmet_req, p.work);
@@ -163,19 +172,8 @@ static void nvmet_passthru_execute_cmd_work(struct work_struct *w)
 		}
 	}
 
-	req->cqe->result = nvme_req(rq)->result;
-	nvmet_req_complete(req, status);
-	blk_mq_free_request(rq);
-}
-
-static void nvmet_passthru_req_done(struct request *rq,
-				    blk_status_t blk_status)
-{
-	struct nvmet_req *req = rq->end_io_data;
-
-	req->cqe->result = nvme_req(rq)->result;
-	nvmet_req_complete(req, nvme_req(rq)->status);
-	blk_mq_free_request(rq);
+	rq->end_io_data = req;
+	nvmet_passthru_req_done(rq, 0 /* this value is not used for request */);
 }
 
 static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq)
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag
  2020-08-10 18:54 [PATCH 00/14] nvmet: passthru few fixes and improvements Chaitanya Kulkarni
                   ` (6 preceding siblings ...)
  2020-08-10 18:54 ` [PATCH 07/14] nvmet: reuse pt-req-done for rq completion Chaitanya Kulkarni
@ 2020-08-10 18:54 ` Chaitanya Kulkarni
  2020-08-10 21:11   ` Keith Busch
  2020-08-10 22:14   ` Logan Gunthorpe
  2020-08-10 18:54 ` [PATCH 09/14] nvmet: based on passthru req control use REQ_FUA Chaitanya Kulkarni
                   ` (5 subsequent siblings)
  13 siblings, 2 replies; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 18:54 UTC (permalink / raw)
  To: linux-nvme; +Cc: hch, Chaitanya Kulkarni, sagi

The function nvme_is_write() only checks the last bit of the opcode to
determine if command is a write command or not based on that it sets
the bio->bi_opf for passthru bio with REQ_SYNC | REQ_IDLE.

The condition is true for nvme_cmd_write (opcode 0x01), but there are
commands present in the NVMe command set and are part for NVM Write
Command Group that doesn't have a last bit set. e.g.
nvme_cmd_write_zeroes 0x08. With current implementation for these
commands we will never set the REQ_SYNC for bio->bi_opf even though
command is from NVMe write command set.

For passthru backend introduce nvmet_write_cmd_group() and use the
commands from NVMe 1.4 spec write command group to set the REQ_SYNC
flag for passthru bio and update nvmet_passthru_map_sg accordingly.
Since these are low frequency commands annotate condition with
unlikely respectively.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/passthru.c | 31 ++++++++++++++++++++++++++++---
 1 file changed, 28 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index 0b18038c44bb..5f06f4290c52 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -176,6 +176,29 @@ static void nvmet_passthru_execute_cmd_work(struct work_struct *w)
 	nvmet_passthru_req_done(rq, 0 /* this value is not used for request */);
 }
 
+static inline bool nvmet_write_cmd_group(__u8 opcode)
+{
+	bool ret;
+
+	/* NVM Express Revision 1.4 Figure 462. */
+	switch (opcode) {
+	case nvme_admin_format_nvm:
+	case nvme_admin_ns_attach:
+	case nvme_admin_ns_mgmt:
+	case nvme_admin_sanitize_nvm:
+	case nvme_admin_security_send:
+	case nvme_cmd_write_zeroes:
+	case nvme_cmd_write_uncor:
+		ret = true;
+		break;
+	default:
+		ret = false;
+		break;
+	}
+
+	return ret;
+}
+
 static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq)
 {
 	unsigned int op_flags = 0;
@@ -184,10 +207,12 @@ static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq)
 	struct bio *bio;
 	int i, ret;
 
-	if (req->cmd->common.opcode == nvme_cmd_flush)
-		op_flags = REQ_FUA;
-	else if (nvme_is_write(req->cmd))
+	if (nvme_is_write(req->cmd))
 		op_flags = REQ_SYNC | REQ_IDLE;
+	else if (unlikely(nvmet_write_cmd_group(req->cmd->common.opcode)))
+		op_flags = REQ_SYNC;
+	else if (unlikely(req->cmd->common.opcode == nvme_cmd_flush))
+		op_flags = REQ_FUA;
 
 	bio = bio_alloc(GFP_KERNEL, min(sg_cnt, BIO_MAX_PAGES));
 	bio->bi_end_io = bio_put;
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 09/14] nvmet: based on passthru req control use REQ_FUA
  2020-08-10 18:54 [PATCH 00/14] nvmet: passthru few fixes and improvements Chaitanya Kulkarni
                   ` (7 preceding siblings ...)
  2020-08-10 18:54 ` [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag Chaitanya Kulkarni
@ 2020-08-10 18:54 ` Chaitanya Kulkarni
  2020-08-14 10:37   ` Christoph Hellwig
  2020-08-10 18:54 ` [PATCH 10/14] nvmet: prep patch for pt ctrl put wrapper Chaitanya Kulkarni
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 18:54 UTC (permalink / raw)
  To: linux-nvme; +Cc: hch, Chaitanya Kulkarni, sagi

The write request for passthru driver currently does not consider
nvme control field which could be set in the nvme_rw_command from host
side.

Update op_flags with REQ_FUA when control field flag is set to
NVME_RW_FUA.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/passthru.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index 5f06f4290c52..a703f3f14b35 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -207,11 +207,15 @@ static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq)
 	struct bio *bio;
 	int i, ret;
 
-	if (nvme_is_write(req->cmd))
+	if (nvme_is_write(req->cmd)) {
 		op_flags = REQ_SYNC | REQ_IDLE;
-	else if (unlikely(nvmet_write_cmd_group(req->cmd->common.opcode)))
+		if (req->cmd->rw.control & cpu_to_le16(NVME_RW_FUA))
+			op_flags |= REQ_FUA;
+	} else if (unlikely(nvmet_write_cmd_group(req->cmd->common.opcode))) {
 		op_flags = REQ_SYNC;
-	else if (unlikely(req->cmd->common.opcode == nvme_cmd_flush))
+		if (req->cmd->rw.control & cpu_to_le16(NVME_RW_FUA))
+			op_flags |= REQ_FUA;
+	} else if (unlikely(req->cmd->common.opcode == nvme_cmd_flush))
 		op_flags = REQ_FUA;
 
 	bio = bio_alloc(GFP_KERNEL, min(sg_cnt, BIO_MAX_PAGES));
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 10/14] nvmet: prep patch for pt ctrl put wrapper
  2020-08-10 18:54 [PATCH 00/14] nvmet: passthru few fixes and improvements Chaitanya Kulkarni
                   ` (8 preceding siblings ...)
  2020-08-10 18:54 ` [PATCH 09/14] nvmet: based on passthru req control use REQ_FUA Chaitanya Kulkarni
@ 2020-08-10 18:54 ` Chaitanya Kulkarni
  2020-08-10 18:54 ` [PATCH 11/14] nvmet: fix nvme module ref count Oops Chaitanya Kulkarni
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 18:54 UTC (permalink / raw)
  To: linux-nvme; +Cc: hch, Chaitanya Kulkarni, sagi

This is a preparation patch to cnetralize the passthru put controller
action. Next patch actully fixes the bug in passthru code which needs
this helper to avoid the duplication of the code.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/passthru.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index a703f3f14b35..32f4951a1df7 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -486,6 +486,11 @@ u16 nvmet_parse_passthru_admin_cmd(struct nvmet_req *req)
 	}
 }
 
+static void nvme_pt_put_ctrl(struct nvme_ctrl *ctrl)
+{
+	nvme_put_ctrl(ctrl);
+}
+
 int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys)
 {
 	struct nvme_ctrl *ctrl;
@@ -536,7 +541,7 @@ int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys)
 	return 0;
 
 out_put_ctrl:
-	nvme_put_ctrl(ctrl);
+	nvme_pt_put_ctrl(ctrl);
 out_unlock:
 	mutex_unlock(&subsys->lock);
 	return ret;
@@ -546,7 +551,7 @@ static void __nvmet_passthru_ctrl_disable(struct nvmet_subsys *subsys)
 {
 	if (subsys->passthru_ctrl) {
 		xa_erase(&passthru_subsystems, subsys->passthru_ctrl->cntlid);
-		nvme_put_ctrl(subsys->passthru_ctrl);
+		nvme_pt_put_ctrl(subsys->passthru_ctrl);
 	}
 	subsys->passthru_ctrl = NULL;
 	subsys->ver = NVMET_DEFAULT_VS;
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 11/14] nvmet: fix nvme module ref count Oops
  2020-08-10 18:54 [PATCH 00/14] nvmet: passthru few fixes and improvements Chaitanya Kulkarni
                   ` (9 preceding siblings ...)
  2020-08-10 18:54 ` [PATCH 10/14] nvmet: prep patch for pt ctrl put wrapper Chaitanya Kulkarni
@ 2020-08-10 18:54 ` Chaitanya Kulkarni
  2020-08-10 22:39   ` Logan Gunthorpe
  2020-08-10 18:54 ` [PATCH 12/14] block: move blk_rq_bio_prep() to linux/blk-mq.h Chaitanya Kulkarni
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 18:54 UTC (permalink / raw)
  To: linux-nvme; +Cc: hch, Chaitanya Kulkarni, sagi

In the passthru controller enable path current code doesn't take the
reference to the passthru ctrl module. Which produces following Oops :-

Entering kdb (current=0xffff8887f8290000, pid 3128) on processor 30 Oops: (null)
due to oops @ 0xffffffffa01019ad
CPU: 30 PID: 3128 Comm: bash Tainted: G        W  OE     5.8.0-rc4nvme-5.9+ #35
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.4
RIP: 0010:nvme_free_ctrl+0x234/0x285 [nvme_core]
Code: 57 10 a0 e8 73 bf 02 e1 ba 3d 11 00 00 48 c7 c6 98 33 10 a0 48 c7 c7 1d 57 10 a0 e8 5b bf 02 e1 8
RSP: 0018:ffffc90001d63de0 EFLAGS: 00010246
RAX: ffffffffa05c0440 RBX: ffff8888119e45a0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff8888177e9550 RDI: ffff8888119e43b0
RBP: ffff8887d4768000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: ffffc90001d63c90 R12: ffff8888119e43b0
R13: ffff8888119e5108 R14: dead000000000100 R15: ffff8888119e5108
FS:  00007f1ef27b0740(0000) GS:ffff888817600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa05c0470 CR3: 00000007f6bee000 CR4: 00000000003406e0
Call Trace:
 device_release+0x27/0x80
 kobject_put+0x98/0x170
 nvmet_passthru_ctrl_disable+0x4a/0x70 [nvmet]
 nvmet_passthru_enable_store+0x4c/0x90 [nvmet]
 configfs_write_file+0xe6/0x150
 vfs_write+0xba/0x1e0
 ksys_write+0x5f/0xe0
 do_syscall_64+0x52/0xb0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f1ef1eb2840
Code: Bad RIP value.
RSP: 002b:00007fffdbff0eb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f1ef1eb2840
RDX: 0000000000000002 RSI: 00007f1ef27d2000 RDI: 0000000000000001
RBP: 00007f1ef27d2000 R08: 000000000000000a R09: 00007f1ef27b0740
R10: 0000000000000001 R11: 0000000000000246 R12: 00007f1ef2186400
R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000000

We fix that by taking a module ref count in nvme_get_ctrl() and release
that ref count in nvme_pt_put_ctrl().

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/host/core.c       | 5 +++++
 drivers/nvme/target/passthru.c | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 05aa568a60af..ba019c0b514b 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4608,6 +4608,11 @@ struct nvme_ctrl *nvme_ctrl_get_by_path(const char *path)
 
 	ctrl = f->private_data;
 	nvme_get_ctrl(ctrl);
+	if (!try_module_get(ctrl->ops->module)) {
+		pr_err("try_module_get failed\n");
+		nvme_put_ctrl(ctrl);
+		ctrl = ERR_PTR(-EINVAL);
+	}
 
 out_close:
 	filp_close(f, NULL);
diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index 32f4951a1df7..5925e93c4682 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -488,6 +488,7 @@ u16 nvmet_parse_passthru_admin_cmd(struct nvmet_req *req)
 
 static void nvme_pt_put_ctrl(struct nvme_ctrl *ctrl)
 {
+	module_put(ctrl->ops->module);
 	nvme_put_ctrl(ctrl);
 }
 
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 12/14] block: move blk_rq_bio_prep() to linux/blk-mq.h
  2020-08-10 18:54 [PATCH 00/14] nvmet: passthru few fixes and improvements Chaitanya Kulkarni
                   ` (10 preceding siblings ...)
  2020-08-10 18:54 ` [PATCH 11/14] nvmet: fix nvme module ref count Oops Chaitanya Kulkarni
@ 2020-08-10 18:54 ` Chaitanya Kulkarni
  2020-08-10 18:54 ` [PATCH 13/14] nvmet: use minimized version of blk_rq_append_bio Chaitanya Kulkarni
  2020-08-10 18:54 ` [PATCH 14/14] nvmet: use inline bio for passthru fast path Chaitanya Kulkarni
  13 siblings, 0 replies; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 18:54 UTC (permalink / raw)
  To: linux-nvme; +Cc: hch, Chaitanya Kulkarni, sagi

This is a preparation patch to have minimal block layer request bio
append functionality in the context of the NVMeOF Passthru driver which
falls in the fast path.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 block/blk.h            | 12 ------------
 include/linux/blk-mq.h | 12 ++++++++++++
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/block/blk.h b/block/blk.h
index 94f7c084f68f..cd107e60144c 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -96,18 +96,6 @@ static inline bool bvec_gap_to_prev(struct request_queue *q,
 	return __bvec_gap_to_prev(q, bprv, offset);
 }
 
-static inline void blk_rq_bio_prep(struct request *rq, struct bio *bio,
-		unsigned int nr_segs)
-{
-	rq->nr_phys_segments = nr_segs;
-	rq->__data_len = bio->bi_iter.bi_size;
-	rq->bio = rq->biotail = bio;
-	rq->ioprio = bio_prio(bio);
-
-	if (bio->bi_disk)
-		rq->rq_disk = bio->bi_disk;
-}
-
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 void blk_flush_integrity(void);
 bool __bio_integrity_endio(struct bio *);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 23230c1d031e..be7528231157 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -596,6 +596,18 @@ static inline void blk_mq_cleanup_rq(struct request *rq)
 		rq->q->mq_ops->cleanup_rq(rq);
 }
 
+static inline void blk_rq_bio_prep(struct request *rq, struct bio *bio,
+		unsigned int nr_segs)
+{
+	rq->nr_phys_segments = nr_segs;
+	rq->__data_len = bio->bi_iter.bi_size;
+	rq->bio = rq->biotail = bio;
+	rq->ioprio = bio_prio(bio);
+
+	if (bio->bi_disk)
+		rq->rq_disk = bio->bi_disk;
+}
+
 blk_qc_t blk_mq_submit_bio(struct bio *bio);
 
 #endif
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 13/14] nvmet: use minimized version of blk_rq_append_bio
  2020-08-10 18:54 [PATCH 00/14] nvmet: passthru few fixes and improvements Chaitanya Kulkarni
                   ` (11 preceding siblings ...)
  2020-08-10 18:54 ` [PATCH 12/14] block: move blk_rq_bio_prep() to linux/blk-mq.h Chaitanya Kulkarni
@ 2020-08-10 18:54 ` Chaitanya Kulkarni
  2020-08-10 22:41   ` Logan Gunthorpe
  2020-08-10 18:54 ` [PATCH 14/14] nvmet: use inline bio for passthru fast path Chaitanya Kulkarni
  13 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 18:54 UTC (permalink / raw)
  To: linux-nvme; +Cc: hch, Chaitanya Kulkarni, sagi

The function blk_rq_append_bio() is a genereric API written for all
types driver (having bounce buffers) and different context (where
request is already having a bio i.e. rq->bio != NULL).

It does mainly three things: calculating the segments, bounce queue and
if req->bio == NULL call blk_rq_bio_prep() or handle low level merge()
case.

The NVMe PCIe driver does not use the queue bounce mechanism. In order
to find this out for each request processing in the passthru
blk_rq_append_bio() does extra work in the fast path for each request.

When I ran I/Os with different block sizes on the passthru controller
I found that we can reuse the req->sg_cnt instead of iterating over the
bvecs to find out nr_segs in blk_rq_append_bio(). This calculation in
blk_rq_append_bio() is a duplication of work given that we have the
value in req->sg_cnt. (correct me here if I'm wrong).

With NVMe passthru request based driver we allocate fresh request each
time, so every call to blk_rq_append_bio() rq->bio will be NULL i.e.
we don't really need the second condition in the blk_rq_append_bio()
and the resulting error condition in the caller of blk_rq_append_bio().

So for NVMeOF passthru driver recalculating the segments, bounce check
and ll_back_merge_code is not needed such that we can get away with the
minimal version of the blk_rq_append_bio() which removes the error check
in the fast path along with extra variable in nvmet_passthru_map_sg().

This patch updates the nvmet_passthru_map_sg() such that it does only
appending the bio to the request in the context of the NVMeOF Passthru
driver. Following are perf numbers :-

With current implementation (blk_rq_append_bio()) :-
----------------------------------------------------
+    5.80%     0.02%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
+    5.44%     0.01%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
+    4.88%     0.00%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
+    5.44%     0.01%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
+    4.86%     0.01%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
+    5.17%     0.00%  kworker/0:2-eve  [nvmet]  [k] nvmet_passthru_execute_cmd

With this patch :-
----------------------------------------------------
+    3.14%     0.02%  kworker/0:2-eve  [nvmet]  [k] nvmet_passthru_execute_cmd
+    3.26%     0.01%  kworker/0:2-eve  [nvmet]  [k] nvmet_passthru_execute_cmd
+    5.37%     0.01%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
+    5.18%     0.02%  kworker/0:2-eve  [nvmet]  [k] nvmet_passthru_execute_cmd
+    4.84%     0.02%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
+    4.87%     0.01%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/passthru.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index 5925e93c4682..a7ccc817ac2d 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -9,6 +9,7 @@
  */
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 #include <linux/module.h>
+#include <linux/blk-mq.h>
 
 #include "../host/nvme.h"
 #include "nvmet.h"
@@ -205,7 +206,7 @@ static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq)
 	int sg_cnt = req->sg_cnt;
 	struct scatterlist *sg;
 	struct bio *bio;
-	int i, ret;
+	int i;
 
 	if (nvme_is_write(req->cmd)) {
 		op_flags = REQ_SYNC | REQ_IDLE;
@@ -231,11 +232,7 @@ static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq)
 		sg_cnt--;
 	}
 
-	ret = blk_rq_append_bio(rq, &bio);
-	if (unlikely(ret)) {
-		bio_put(bio);
-		return ret;
-	}
+	blk_rq_bio_prep(rq, bio, req->sg_cnt);
 
 	return 0;
 }
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 14/14] nvmet: use inline bio for passthru fast path
  2020-08-10 18:54 [PATCH 00/14] nvmet: passthru few fixes and improvements Chaitanya Kulkarni
                   ` (12 preceding siblings ...)
  2020-08-10 18:54 ` [PATCH 13/14] nvmet: use minimized version of blk_rq_append_bio Chaitanya Kulkarni
@ 2020-08-10 18:54 ` Chaitanya Kulkarni
  2020-08-10 22:47   ` Logan Gunthorpe
  13 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 18:54 UTC (permalink / raw)
  To: linux-nvme; +Cc: hch, Chaitanya Kulkarni, sagi

In nvmet_passthru_execute_cmd() which is a high frequency function
it uses bio_alloc() which leads to memory allocation from the fs pool
for each I/O.

For NVMeoF nvmet_req we already have inline_bvec allocated as a part of
request allocation that can be used with preallocated bio when we
already know the size of request before bio allocation with bio_alloc(),
which we already do.

Introduce a bio member for the nvmet_req passthru anon union. In the
fast path, check if we can get away with inline bvec and bio from
nvmet_req with bio_init() call before actually allocating from the
bio_alloc().

This will be useful to avoid any new memory allocation under high
memory pressure situation and get rid of any extra work of
allocation (bio_alloc()) vs initialization (bio_init()) when
transfer len is < NVMET_MAX_INLINE_DATA_LEN that user can configure at
compile time.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/target/nvmet.h    |  1 +
 drivers/nvme/target/passthru.c | 21 ++++++++++++++++++---
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index 47ee3fb193bd..3dd18f593472 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -331,6 +331,7 @@ struct nvmet_req {
 		} f;
 		struct {
 			struct request		*rq;
+			struct bio		inline_bio;
 			struct work_struct      work;
 			bool			use_workqueue;
 		} p;
diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index a7ccc817ac2d..8b22a6c8f57c 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -200,6 +200,14 @@ static inline bool nvmet_write_cmd_group(__u8 opcode)
 	return ret;
 }
 
+static void nvmet_passthru_bio_done(struct bio *bio)
+{
+	struct nvmet_req *req = bio->bi_private;
+
+	if (bio != &req->p.inline_bio)
+		bio_put(bio);
+}
+
 static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq)
 {
 	unsigned int op_flags = 0;
@@ -219,14 +227,21 @@ static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq)
 	} else if (unlikely(req->cmd->common.opcode == nvme_cmd_flush))
 		op_flags = REQ_FUA;
 
-	bio = bio_alloc(GFP_KERNEL, min(sg_cnt, BIO_MAX_PAGES));
-	bio->bi_end_io = bio_put;
+	if (req->transfer_len <= NVMET_MAX_INLINE_DATA_LEN) {
+		bio = &req->p.inline_bio;
+		bio_init(bio, req->inline_bvec, ARRAY_SIZE(req->inline_bvec));
+	} else {
+		bio = bio_alloc(GFP_KERNEL, min(sg_cnt, BIO_MAX_PAGES));
+	}
+
+	bio->bi_end_io = nvmet_passthru_bio_done;
 	bio->bi_opf = req_op(rq) | op_flags;
+	bio->bi_private = req;
 
 	for_each_sg(req->sg, sg, req->sg_cnt, i) {
 		if (bio_add_pc_page(rq->q, bio, sg_page(sg), sg->length,
 				    sg->offset) < sg->length) {
-			bio_put(bio);
+			nvmet_passthru_bio_done(bio);
 			return -EINVAL;
 		}
 		sg_cnt--;
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH 06/14] nvmet: use unlikely for uncommon commands
  2020-08-10 18:54 ` [PATCH 06/14] nvmet: use unlikely for uncommon commands Chaitanya Kulkarni
@ 2020-08-10 20:46   ` Keith Busch
  2020-08-10 22:01     ` Logan Gunthorpe
  0 siblings, 1 reply; 53+ messages in thread
From: Keith Busch @ 2020-08-10 20:46 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: hch, linux-nvme, sagi

On Mon, Aug 10, 2020 at 11:54:48AM -0700, Chaitanya Kulkarni wrote:
> I/O commands (nvme_cmd_read, nvme_cmd_write) are most common commands
> when accessing passthru controller, most controllers should not
> have set the effects for r/w I/O commnds (atleast I don't know at this
> moment).

The method that gets effects actually always returns 0 for IO commands,
which is the case if you are sending this command to a namespace.

>  	effects = nvme_command_effects(ctrl, ns, req->cmd->common.opcode);

In which case you could skip retrieving effects if ns != NULL.

> -	if (req->p.use_workqueue || effects) {
> +	if (unlikely(req->p.use_workqueue || effects)) {
>  		INIT_WORK(&req->p.work, nvmet_passthru_execute_cmd_work);
>  		req->p.rq = rq;
>  		schedule_work(&req->p.work);

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag
  2020-08-10 18:54 ` [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag Chaitanya Kulkarni
@ 2020-08-10 21:11   ` Keith Busch
  2020-08-10 23:04     ` Chaitanya Kulkarni
  2020-08-10 22:14   ` Logan Gunthorpe
  1 sibling, 1 reply; 53+ messages in thread
From: Keith Busch @ 2020-08-10 21:11 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: hch, linux-nvme, sagi

On Mon, Aug 10, 2020 at 11:54:50AM -0700, Chaitanya Kulkarni wrote:
> The function nvme_is_write() only checks the last bit of the opcode to
> determine if command is a write command or not based on that it sets
> the bio->bi_opf for passthru bio with REQ_SYNC | REQ_IDLE.

Are these flags even used? I'm not finding these getting copied to the
'struct request' in this command's setup, nor am I finding anything in
the blk_execute_rq() path that checks the bio's bi_opf.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 01/14] nvmet: for pt I/O cmds annotate req->sg_cnt likely
  2020-08-10 18:54 ` [PATCH 01/14] nvmet: for pt I/O cmds annotate req->sg_cnt likely Chaitanya Kulkarni
@ 2020-08-10 21:54   ` Logan Gunthorpe
  2020-08-10 22:54     ` Chaitanya Kulkarni
  0 siblings, 1 reply; 53+ messages in thread
From: Logan Gunthorpe @ 2020-08-10 21:54 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: hch, sagi

Hey,

You forgot to CC me again....

On 2020-08-10 12:54 p.m., Chaitanya Kulkarni wrote:
> I/O Commands (nvme_cmd_read, nvme_cmd_write) are most command commands
> when accessing passthru controller. For I/O commands data is represented
> in SG list, so sg_cnt != 0 in most cases.
> 
> Annotate req->sg_cnt check with likely.

Does adding these likely annotations actually produce any benefit? Why
can't we rely on the branch predictor in these instances?

I vaguely remember somebody complaining about all the extra likelys and
unlikelys which is why I removed them.

Logan

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 03/14] nvmet: use consistent type with id->nlbaf
  2020-08-10 18:54 ` [PATCH 03/14] nvmet: use consistent type with id->nlbaf Chaitanya Kulkarni
@ 2020-08-10 21:55   ` Logan Gunthorpe
  2020-08-10 22:56     ` Chaitanya Kulkarni
  0 siblings, 1 reply; 53+ messages in thread
From: Logan Gunthorpe @ 2020-08-10 21:55 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: hch, sagi



On 2020-08-10 12:54 p.m., Chaitanya Kulkarni wrote:
> In function nvmet_passthru_override_id_ns() while iterating
> over namespace lba format loop variable is declared as int that is
> inconsistent with the id->nlbaf type which is u8.
> 
> Make loop variable of the same type as id->nlbaf.

Why is this an improvement? Seems to me a loop variable should use the
fastest type (ie an int).

Logan

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 04/14] nvmet: use consistent type for op_flag
  2020-08-10 18:54 ` [PATCH 04/14] nvmet: use consistent type for op_flag Chaitanya Kulkarni
@ 2020-08-10 21:56   ` Logan Gunthorpe
  2020-08-10 22:56     ` Chaitanya Kulkarni
  0 siblings, 1 reply; 53+ messages in thread
From: Logan Gunthorpe @ 2020-08-10 21:56 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: hch, sagi



On 2020-08-10 12:54 p.m., Chaitanya Kulkarni wrote:
> In function nvmet_passthru_map_sg() we set the bio->bi_opf ored with
> req_op and op_flags set based on the NVMe cmd. The variable bio->bi_opf
> is declared as unsigned int use same type for op_flag.
> 
> Also, adjust the indentation according to new length.

I think you mean "order" (per reverse christmas-tree convention)... Not
"indentation".

> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>

Reviewed By: Logan Gunthorpe <logang@deltatee.com>

> ---
>  drivers/nvme/target/passthru.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
> index 13d73f36e927..58377420874f 100644
> --- a/drivers/nvme/target/passthru.c
> +++ b/drivers/nvme/target/passthru.c
> @@ -180,9 +180,9 @@ static void nvmet_passthru_req_done(struct request *rq,
>  
>  static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq)
>  {
> +	unsigned int op_flags = 0;
>  	int sg_cnt = req->sg_cnt;
>  	struct scatterlist *sg;
> -	int op_flags = 0;
>  	struct bio *bio;
>  	int i, ret;
>  
> 

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 05/14] nvmet: get rid of the extra variable
  2020-08-10 18:54 ` [PATCH 05/14] nvmet: get rid of the extra variable Chaitanya Kulkarni
@ 2020-08-10 21:58   ` Logan Gunthorpe
  2020-08-10 23:00     ` Chaitanya Kulkarni
  0 siblings, 1 reply; 53+ messages in thread
From: Logan Gunthorpe @ 2020-08-10 21:58 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: hch, sagi



On 2020-08-10 12:54 p.m., Chaitanya Kulkarni wrote:
> In function nvmet_passthru_execute_cmd() the return value of the
> function nvmet_passthru_map_sg() is not used to determine stauts i.e.
> we can safely ignore the actual cause of error that is returned and just
> check for binary status pass or fail.
> 
> The prior check to this function call can be combined safely with the &&
> so that we can get rid of the extra variable and second if condition.
> 
> Remove the extra variable and adjust the code accordingly to have only
> one if condition.
> 
> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> ---
>  drivers/nvme/target/passthru.c | 10 +++-------
>  1 file changed, 3 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
> index 58377420874f..9a175b8b9d57 100644
> --- a/drivers/nvme/target/passthru.c
> +++ b/drivers/nvme/target/passthru.c
> @@ -221,7 +221,6 @@ static void nvmet_passthru_execute_cmd(struct nvmet_req *req)
>  	struct request *rq = NULL;
>  	u32 effects;
>  	u16 status;
> -	int ret;
>  
>  	if (likely(req->sq->qid != 0)) {
>  		u32 nsid = le32_to_cpu(req->cmd->common.nsid);
> @@ -242,12 +241,9 @@ static void nvmet_passthru_execute_cmd(struct nvmet_req *req)
>  		goto out_put_ns;
>  	}
>  
> -	if (likely(req->sg_cnt)) {
> -		ret = nvmet_passthru_map_sg(req, rq);
> -		if (unlikely(ret)) {
> -			status = NVME_SC_INTERNAL;
> -			goto out_put_req;
> -		}
> +	if (likely(req->sg_cnt) && unlikely(nvmet_passthru_map_sg(req, rq))) {
> +		status = NVME_SC_INTERNAL;
> +		goto out_put_req;
>  	}

IMO, that's really ugly and not at all an improvement.

Logan

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 06/14] nvmet: use unlikely for uncommon commands
  2020-08-10 20:46   ` Keith Busch
@ 2020-08-10 22:01     ` Logan Gunthorpe
  0 siblings, 0 replies; 53+ messages in thread
From: Logan Gunthorpe @ 2020-08-10 22:01 UTC (permalink / raw)
  To: Keith Busch, Chaitanya Kulkarni; +Cc: hch, linux-nvme, sagi



On 2020-08-10 2:46 p.m., Keith Busch wrote:
> On Mon, Aug 10, 2020 at 11:54:48AM -0700, Chaitanya Kulkarni wrote:
>> I/O commands (nvme_cmd_read, nvme_cmd_write) are most common commands
>> when accessing passthru controller, most controllers should not
>> have set the effects for r/w I/O commnds (atleast I don't know at this
>> moment).
> 
> The method that gets effects actually always returns 0 for IO commands,
> which is the case if you are sending this command to a namespace.
> 
>>  	effects = nvme_command_effects(ctrl, ns, req->cmd->common.opcode);
> 
> In which case you could skip retrieving effects if ns != NULL.

Is that a safe assumption going forward? If someone changes
nvme_command_effects are they going to know they need to change the
callsite?

Logan

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 07/14] nvmet: reuse pt-req-done for rq completion
  2020-08-10 18:54 ` [PATCH 07/14] nvmet: reuse pt-req-done for rq completion Chaitanya Kulkarni
@ 2020-08-10 22:10   ` Logan Gunthorpe
  2020-08-10 23:02     ` Chaitanya Kulkarni
  0 siblings, 1 reply; 53+ messages in thread
From: Logan Gunthorpe @ 2020-08-10 22:10 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: hch, sagi



On 2020-08-10 12:54 p.m., Chaitanya Kulkarni wrote:
> In the function nvmet_passthru_execute_cmd_work() after overriding
> id_[ctr|ns] results it completes the request by assigning the target
> request completion queue entry result from nvme request result,
> actually completing request on the transport and calling
> blk_mq_free_request().
> 
> The function nvmet_passhru_req_done does the same thing, use that and
> remove the duplicate code in nvmet_passhru_execute_cmd_work().
> 
> The last parameter for the nvmet_passthru_req_done() which is
> blk_status_t is not used since we derive the result from the nvme_req.

Christoph had specifically requested open coding this in earlier feedback:

https://lore.kernel.org/linux-block/20191010123406.GC28921@lst.de/

It was originally called nvmet_passthru_req_complete() and was open
coded at his request.

I think the way it was merged ended up being cleaner and easier to read
as well.

Logan

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag
  2020-08-10 18:54 ` [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag Chaitanya Kulkarni
  2020-08-10 21:11   ` Keith Busch
@ 2020-08-10 22:14   ` Logan Gunthorpe
  2020-08-10 23:08     ` Chaitanya Kulkarni
  1 sibling, 1 reply; 53+ messages in thread
From: Logan Gunthorpe @ 2020-08-10 22:14 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: hch, sagi



On 2020-08-10 12:54 p.m., Chaitanya Kulkarni wrote:
> The function nvme_is_write() only checks the last bit of the opcode to
> determine if command is a write command or not based on that it sets
> the bio->bi_opf for passthru bio with REQ_SYNC | REQ_IDLE.
> 
> The condition is true for nvme_cmd_write (opcode 0x01), but there are
> commands present in the NVMe command set and are part for NVM Write
> Command Group that doesn't have a last bit set. e.g.
> nvme_cmd_write_zeroes 0x08. With current implementation for these
> commands we will never set the REQ_SYNC for bio->bi_opf even though
> command is from NVMe write command set.
> 
> For passthru backend introduce nvmet_write_cmd_group() and use the
> commands from NVMe 1.4 spec write command group to set the REQ_SYNC
> flag for passthru bio and update nvmet_passthru_map_sg accordingly.
> Since these are low frequency commands annotate condition with
> unlikely respectively.
> 
> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> ---
>  drivers/nvme/target/passthru.c | 31 ++++++++++++++++++++++++++++---
>  1 file changed, 28 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
> index 0b18038c44bb..5f06f4290c52 100644
> --- a/drivers/nvme/target/passthru.c
> +++ b/drivers/nvme/target/passthru.c
> @@ -176,6 +176,29 @@ static void nvmet_passthru_execute_cmd_work(struct work_struct *w)
>  	nvmet_passthru_req_done(rq, 0 /* this value is not used for request */);
>  }
>  
> +static inline bool nvmet_write_cmd_group(__u8 opcode)
> +{
> +	bool ret;
> +
> +	/* NVM Express Revision 1.4 Figure 462. */
> +	switch (opcode) {
> +	case nvme_admin_format_nvm:
> +	case nvme_admin_ns_attach:
> +	case nvme_admin_ns_mgmt:
> +	case nvme_admin_sanitize_nvm:
> +	case nvme_admin_security_send:

All of these admin commands are rejected as INVALID_OPCODE in
nvmet_parse_passthru_admin_cmd(). So adding them here seems confusing at
best.

Logan

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 11/14] nvmet: fix nvme module ref count Oops
  2020-08-10 18:54 ` [PATCH 11/14] nvmet: fix nvme module ref count Oops Chaitanya Kulkarni
@ 2020-08-10 22:39   ` Logan Gunthorpe
  2020-08-10 23:11     ` Chaitanya Kulkarni
  0 siblings, 1 reply; 53+ messages in thread
From: Logan Gunthorpe @ 2020-08-10 22:39 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: hch, sagi



On 2020-08-10 12:54 p.m., Chaitanya Kulkarni wrote:
> In the passthru controller enable path current code doesn't take the
> reference to the passthru ctrl module. Which produces following Oops :-

Please mention the steps required to reproduce the oops.

[snip]

> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 05aa568a60af..ba019c0b514b 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4608,6 +4608,11 @@ struct nvme_ctrl *nvme_ctrl_get_by_path(const char *path)
>  
>  	ctrl = f->private_data;
>  	nvme_get_ctrl(ctrl);
> +	if (!try_module_get(ctrl->ops->module)) {
> +		pr_err("try_module_get failed\n");
> +		nvme_put_ctrl(ctrl);
> +		ctrl = ERR_PTR(-EINVAL);
> +	}
This looks racy. If the module goes away after the nvme_get_ctrl(), then
it will still crash.

But there might be a similar problem here with the core code. If a user
opens a char device, they get a pointer to the ctrl in private_data
without a reference. If the ctrl then goes away and they try to do an
ioctl, it will crash. Similarly, if the underlying driver module goes
away it will probably crash in the same way as this oops.

Looks to me like we need a reference taken to the ctrl and the module in
nvme_dev_open() (which are put in the file release). Then we know the
module can't go away inside nvme_ctrl_get_by_path().

>  out_close:
>  	filp_close(f, NULL);
> diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
> index 32f4951a1df7..5925e93c4682 100644
> --- a/drivers/nvme/target/passthru.c
> +++ b/drivers/nvme/target/passthru.c
> @@ -488,6 +488,7 @@ u16 nvmet_parse_passthru_admin_cmd(struct nvmet_req *req)
>  
>  static void nvme_pt_put_ctrl(struct nvme_ctrl *ctrl)
>  {
> +	module_put(ctrl->ops->module);
>  	nvme_put_ctrl(ctrl);
>  }

Given that this must be done for every caller of nvme_ctrl_get_by_path()
shouldn't this be in the core code?

Logan

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 13/14] nvmet: use minimized version of blk_rq_append_bio
  2020-08-10 18:54 ` [PATCH 13/14] nvmet: use minimized version of blk_rq_append_bio Chaitanya Kulkarni
@ 2020-08-10 22:41   ` Logan Gunthorpe
  0 siblings, 0 replies; 53+ messages in thread
From: Logan Gunthorpe @ 2020-08-10 22:41 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: hch, sagi



On 2020-08-10 12:54 p.m., Chaitanya Kulkarni wrote:
> The function blk_rq_append_bio() is a genereric API written for all
> types driver (having bounce buffers) and different context (where
> request is already having a bio i.e. rq->bio != NULL).
> 
> It does mainly three things: calculating the segments, bounce queue and
> if req->bio == NULL call blk_rq_bio_prep() or handle low level merge()
> case.
> 
> The NVMe PCIe driver does not use the queue bounce mechanism. In order
> to find this out for each request processing in the passthru
> blk_rq_append_bio() does extra work in the fast path for each request.
> 
> When I ran I/Os with different block sizes on the passthru controller
> I found that we can reuse the req->sg_cnt instead of iterating over the
> bvecs to find out nr_segs in blk_rq_append_bio(). This calculation in
> blk_rq_append_bio() is a duplication of work given that we have the
> value in req->sg_cnt. (correct me here if I'm wrong).
> 
> With NVMe passthru request based driver we allocate fresh request each
> time, so every call to blk_rq_append_bio() rq->bio will be NULL i.e.
> we don't really need the second condition in the blk_rq_append_bio()
> and the resulting error condition in the caller of blk_rq_append_bio().
> 
> So for NVMeOF passthru driver recalculating the segments, bounce check
> and ll_back_merge_code is not needed such that we can get away with the
> minimal version of the blk_rq_append_bio() which removes the error check
> in the fast path along with extra variable in nvmet_passthru_map_sg().
> 
> This patch updates the nvmet_passthru_map_sg() such that it does only
> appending the bio to the request in the context of the NVMeOF Passthru
> driver. Following are perf numbers :-
> 
> With current implementation (blk_rq_append_bio()) :-
> ----------------------------------------------------
> +    5.80%     0.02%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
> +    5.44%     0.01%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
> +    4.88%     0.00%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
> +    5.44%     0.01%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
> +    4.86%     0.01%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
> +    5.17%     0.00%  kworker/0:2-eve  [nvmet]  [k] nvmet_passthru_execute_cmd
> 
> With this patch :-
> ----------------------------------------------------
> +    3.14%     0.02%  kworker/0:2-eve  [nvmet]  [k] nvmet_passthru_execute_cmd
> +    3.26%     0.01%  kworker/0:2-eve  [nvmet]  [k] nvmet_passthru_execute_cmd
> +    5.37%     0.01%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
> +    5.18%     0.02%  kworker/0:2-eve  [nvmet]  [k] nvmet_passthru_execute_cmd
> +    4.84%     0.02%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
> +    4.87%     0.01%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
> 
> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>

Looks good to me. For this patch and the previous:

Reviewed-by: Logan Gunthorpe <logang@deltatee.com>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 14/14] nvmet: use inline bio for passthru fast path
  2020-08-10 18:54 ` [PATCH 14/14] nvmet: use inline bio for passthru fast path Chaitanya Kulkarni
@ 2020-08-10 22:47   ` Logan Gunthorpe
  2020-08-10 23:14     ` Chaitanya Kulkarni
  0 siblings, 1 reply; 53+ messages in thread
From: Logan Gunthorpe @ 2020-08-10 22:47 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: hch, sagi



On 2020-08-10 12:54 p.m., Chaitanya Kulkarni wrote:
> In nvmet_passthru_execute_cmd() which is a high frequency function
> it uses bio_alloc() which leads to memory allocation from the fs pool
> for each I/O.
> 
> For NVMeoF nvmet_req we already have inline_bvec allocated as a part of
> request allocation that can be used with preallocated bio when we
> already know the size of request before bio allocation with bio_alloc(),
> which we already do.
> 
> Introduce a bio member for the nvmet_req passthru anon union. In the
> fast path, check if we can get away with inline bvec and bio from
> nvmet_req with bio_init() call before actually allocating from the
> bio_alloc().
> 
> This will be useful to avoid any new memory allocation under high
> memory pressure situation and get rid of any extra work of
> allocation (bio_alloc()) vs initialization (bio_init()) when
> transfer len is < NVMET_MAX_INLINE_DATA_LEN that user can configure at
> compile time.
> 
> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> ---
>  drivers/nvme/target/nvmet.h    |  1 +
>  drivers/nvme/target/passthru.c | 21 ++++++++++++++++++---
>  2 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
> index 47ee3fb193bd..3dd18f593472 100644
> --- a/drivers/nvme/target/nvmet.h
> +++ b/drivers/nvme/target/nvmet.h
> @@ -331,6 +331,7 @@ struct nvmet_req {
>  		} f;
>  		struct {
>  			struct request		*rq;
> +			struct bio		inline_bio;
>  			struct work_struct      work;
>  			bool			use_workqueue;
>  		} p;
> diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
> index a7ccc817ac2d..8b22a6c8f57c 100644
> --- a/drivers/nvme/target/passthru.c
> +++ b/drivers/nvme/target/passthru.c
> @@ -200,6 +200,14 @@ static inline bool nvmet_write_cmd_group(__u8 opcode)
>  	return ret;
>  }
>  
> +static void nvmet_passthru_bio_done(struct bio *bio)
> +{
> +	struct nvmet_req *req = bio->bi_private;
> +
> +	if (bio != &req->p.inline_bio)
> +		bio_put(bio);
> +}
> +
>  static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq)
>  {
>  	unsigned int op_flags = 0;
> @@ -219,14 +227,21 @@ static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq)
>  	} else if (unlikely(req->cmd->common.opcode == nvme_cmd_flush))
>  		op_flags = REQ_FUA;
>  
> -	bio = bio_alloc(GFP_KERNEL, min(sg_cnt, BIO_MAX_PAGES));
> -	bio->bi_end_io = bio_put;
> +	if (req->transfer_len <= NVMET_MAX_INLINE_DATA_LEN) {
> +		bio = &req->p.inline_bio;
> +		bio_init(bio, req->inline_bvec, ARRAY_SIZE(req->inline_bvec));
> +	} else {
> +		bio = bio_alloc(GFP_KERNEL, min(sg_cnt, BIO_MAX_PAGES));
> +	}
> +
> +	bio->bi_end_io = nvmet_passthru_bio_done;

Why not just only set bi_end_io to bio_put if it's not an inline bio?
Then we don't need the extra bio done function and an extra if() on
every bio_put() call.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 01/14] nvmet: for pt I/O cmds annotate req->sg_cnt likely
  2020-08-10 21:54   ` Logan Gunthorpe
@ 2020-08-10 22:54     ` Chaitanya Kulkarni
  2020-08-10 22:55       ` Logan Gunthorpe
  0 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 22:54 UTC (permalink / raw)
  To: Logan Gunthorpe, linux-nvme; +Cc: hch, sagi

On 8/10/20 14:54, Logan Gunthorpe wrote:
> Hey,
> 
> You forgot to CC me again....
> 
Ugghhh..
> On 2020-08-10 12:54 p.m., Chaitanya Kulkarni wrote:
>> I/O Commands (nvme_cmd_read, nvme_cmd_write) are most command commands
>> when accessing passthru controller. For I/O commands data is represented
>> in SG list, so sg_cnt != 0 in most cases.
>>
>> Annotate req->sg_cnt check with likely.
> Does adding these likely annotations actually produce any benefit? Why
> can't we rely on the branch predictor in these instances?
> 

They are present in the target code. I don't see any problem having it 
in a passthru code unless it shows severe performance degradation.

> I vaguely remember somebody complaining about all the extra likelys and
> unlikelys which is why I removed them.
> 
> Logan
> 




_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 01/14] nvmet: for pt I/O cmds annotate req->sg_cnt likely
  2020-08-10 22:54     ` Chaitanya Kulkarni
@ 2020-08-10 22:55       ` Logan Gunthorpe
  2020-08-10 23:16         ` Chaitanya Kulkarni
  2020-08-14 10:33         ` hch
  0 siblings, 2 replies; 53+ messages in thread
From: Logan Gunthorpe @ 2020-08-10 22:55 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: hch, sagi



On 2020-08-10 4:54 p.m., Chaitanya Kulkarni wrote:
> On 8/10/20 14:54, Logan Gunthorpe wrote:
>> Hey,
>>
>> You forgot to CC me again....
>>
> Ugghhh..
>> On 2020-08-10 12:54 p.m., Chaitanya Kulkarni wrote:
>>> I/O Commands (nvme_cmd_read, nvme_cmd_write) are most command commands
>>> when accessing passthru controller. For I/O commands data is represented
>>> in SG list, so sg_cnt != 0 in most cases.
>>>
>>> Annotate req->sg_cnt check with likely.
>> Does adding these likely annotations actually produce any benefit? Why
>> can't we rely on the branch predictor in these instances?
>>
> 
> They are present in the target code. I don't see any problem having it 
> in a passthru code unless it shows severe performance degradation.

Isn't that backwards? Don't we only add performance optimizations (which
likely/unlikely are) when they show demonstrable increases in performance?

Logan

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 03/14] nvmet: use consistent type with id->nlbaf
  2020-08-10 21:55   ` Logan Gunthorpe
@ 2020-08-10 22:56     ` Chaitanya Kulkarni
  2020-08-10 22:57       ` Logan Gunthorpe
  0 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 22:56 UTC (permalink / raw)
  To: Logan Gunthorpe, linux-nvme; +Cc: hch, sagi

On 8/10/20 14:55, Logan Gunthorpe wrote:
> Why is this an improvement? Seems to me a loop variable should use the
> fastest type (ie an int).
> 
> Logan
> 
It is a type miss-match, why we need fast type ? its not in fast path.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 04/14] nvmet: use consistent type for op_flag
  2020-08-10 21:56   ` Logan Gunthorpe
@ 2020-08-10 22:56     ` Chaitanya Kulkarni
  0 siblings, 0 replies; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 22:56 UTC (permalink / raw)
  To: Logan Gunthorpe, linux-nvme; +Cc: hch, sagi

On 8/10/20 14:57, Logan Gunthorpe wrote:
> I think you mean "order" (per reverse christmas-tree convention)... Not
> "indentation".
Thanks.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 03/14] nvmet: use consistent type with id->nlbaf
  2020-08-10 22:56     ` Chaitanya Kulkarni
@ 2020-08-10 22:57       ` Logan Gunthorpe
  2020-08-10 23:20         ` Chaitanya Kulkarni
  0 siblings, 1 reply; 53+ messages in thread
From: Logan Gunthorpe @ 2020-08-10 22:57 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: hch, sagi



On 2020-08-10 4:56 p.m., Chaitanya Kulkarni wrote:
> On 8/10/20 14:55, Logan Gunthorpe wrote:
>> Why is this an improvement? Seems to me a loop variable should use the
>> fastest type (ie an int).
>>
>> Logan
>>
> It is a type miss-match, why we need fast type ? its not in fast path.

It's just normal to use an int for a loop variable instead of something
specific like a u8. The type mismatch isn't really an issue here...

Logan



_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 05/14] nvmet: get rid of the extra variable
  2020-08-10 21:58   ` Logan Gunthorpe
@ 2020-08-10 23:00     ` Chaitanya Kulkarni
  2020-08-10 23:13       ` Keith Busch
  0 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 23:00 UTC (permalink / raw)
  To: Logan Gunthorpe, linux-nvme; +Cc: hch, sagi

On 8/10/20 14:58, Logan Gunthorpe wrote:
> IMO, that's really ugly and not at all an improvement.
> 
> Logan

&& operator is perfectly readable when use for short circuit two 
conditions, more than 2 or newline is what ugly not this one.

There are several places in kernel this pattern is present, if it
was ugly that code have never been accepted by the kernel standards.


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 07/14] nvmet: reuse pt-req-done for rq completion
  2020-08-10 22:10   ` Logan Gunthorpe
@ 2020-08-10 23:02     ` Chaitanya Kulkarni
  2020-08-14 10:37       ` hch
  0 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 23:02 UTC (permalink / raw)
  To: Logan Gunthorpe, hch; +Cc: sagi, linux-nvme

On 8/10/20 15:10, Logan Gunthorpe wrote:
> Christoph had specifically requested open coding this in earlier feedback:
> 
> https://lore.kernel.org/linux-block/20191010123406.GC28921@lst.de/
> 
> It was originally called nvmet_passthru_req_complete() and was open
> coded at his request.
> 
> I think the way it was merged ended up being cleaner and easier to read
> as well.
> 
> Logan

I don't understand the reason for this ? Christoph can you please explain ?

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag
  2020-08-10 21:11   ` Keith Busch
@ 2020-08-10 23:04     ` Chaitanya Kulkarni
  2020-08-10 23:10       ` Keith Busch
  0 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 23:04 UTC (permalink / raw)
  To: Keith Busch; +Cc: hch, linux-nvme, sagi

On 8/10/20 14:11, Keith Busch wrote:
> Are these flags even used? I'm not finding these getting copied to the
> 'struct request' in this command's setup, nor am I finding anything in
> the blk_execute_rq() path that checks the bio's bi_opf.
> 

Good point, maybe they are needed for backward compatibility ?
Can we be consistent with the backends and different flags ?

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag
  2020-08-10 22:14   ` Logan Gunthorpe
@ 2020-08-10 23:08     ` Chaitanya Kulkarni
  2020-08-10 23:23       ` Logan Gunthorpe
  0 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 23:08 UTC (permalink / raw)
  To: Logan Gunthorpe, linux-nvme; +Cc: hch, sagi

>> +static inline bool nvmet_write_cmd_group(__u8 opcode)
>> +{
>> +	bool ret;
>> +
>> +	/* NVM Express Revision 1.4 Figure 462. */
>> +	switch (opcode) {
>> +	case nvme_admin_format_nvm:
>> +	case nvme_admin_ns_attach:
>> +	case nvme_admin_ns_mgmt:
>> +	case nvme_admin_sanitize_nvm:
>> +	case nvme_admin_security_send:
> 
> All of these admin commands are rejected as INVALID_OPCODE in
> nvmet_parse_passthru_admin_cmd(). So adding them here seems confusing at
> best.
> 

These are extremely important commands, why these commands are
rejected ?

What is the use of passthru if host cannot create namespaces format ns 
and sanitize the important data ?

Implementing Namespace management commands are in non-passthru mode are
NACKed if I remember correctly.

Irrespective of that it better have it documented there is no harm.
> Logan
> 


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag
  2020-08-10 23:04     ` Chaitanya Kulkarni
@ 2020-08-10 23:10       ` Keith Busch
  2020-08-10 23:23         ` Chaitanya Kulkarni
  0 siblings, 1 reply; 53+ messages in thread
From: Keith Busch @ 2020-08-10 23:10 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: hch, linux-nvme, sagi

On Mon, Aug 10, 2020 at 11:04:36PM +0000, Chaitanya Kulkarni wrote:
> On 8/10/20 14:11, Keith Busch wrote:
> > Are these flags even used? I'm not finding these getting copied to the
> > 'struct request' in this command's setup, nor am I finding anything in
> > the blk_execute_rq() path that checks the bio's bi_opf.
> > 
> 
> Good point, maybe they are needed for backward compatibility ?
> Can we be consistent with the backends and different flags ?

I don't really follow what compatibility this could be about, but if
they're not used for REQ_OP_DRV_IN/OUT commands, as it appears to me to
be the case, you should just remove them rather than add more.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 11/14] nvmet: fix nvme module ref count Oops
  2020-08-10 22:39   ` Logan Gunthorpe
@ 2020-08-10 23:11     ` Chaitanya Kulkarni
  0 siblings, 0 replies; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 23:11 UTC (permalink / raw)
  To: Logan Gunthorpe, linux-nvme; +Cc: hch, sagi

On 8/10/20 15:39, Logan Gunthorpe wrote:
> 
> On 2020-08-10 12:54 p.m., Chaitanya Kulkarni wrote:
>> In the passthru controller enable path current code doesn't take the
>> reference to the passthru ctrl module. Which produces following Oops :-
> Please mention the steps required to reproduce the oops.
> 
> [snip]
> 
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 05aa568a60af..ba019c0b514b 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -4608,6 +4608,11 @@ struct nvme_ctrl *nvme_ctrl_get_by_path(const char *path)
>>   
>>   	ctrl = f->private_data;
>>   	nvme_get_ctrl(ctrl);
>> +	if (!try_module_get(ctrl->ops->module)) {
>> +		pr_err("try_module_get failed\n");
>> +		nvme_put_ctrl(ctrl);
>> +		ctrl = ERR_PTR(-EINVAL);
>> +	}
> This looks racy. If the module goes away after the nvme_get_ctrl(), then
> it will still crash.
> 
Hmm, let me see what I can do.
> But there might be a similar problem here with the core code. If a user
> opens a char device, they get a pointer to the ctrl in private_data
> without a reference. If the ctrl then goes away and they try to do an
> ioctl, it will crash. Similarly, if the underlying driver module goes
> away it will probably crash in the same way as this oops.
> 
I think you are right, still need read code and make sure.
> Looks to me like we need a reference taken to the ctrl and the module in
> nvme_dev_open() (which are put in the file release). Then we know the
> module can't go away inside nvme_ctrl_get_by_path().
> 
Let me see for next version if that works.
>>   out_close:
>>   	filp_close(f, NULL);
>> diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
>> index 32f4951a1df7..5925e93c4682 100644
>> --- a/drivers/nvme/target/passthru.c
>> +++ b/drivers/nvme/target/passthru.c
>> @@ -488,6 +488,7 @@ u16 nvmet_parse_passthru_admin_cmd(struct nvmet_req *req)
>>   
>>   static void nvme_pt_put_ctrl(struct nvme_ctrl *ctrl)
>>   {
>> +	module_put(ctrl->ops->module);
>>   	nvme_put_ctrl(ctrl);
>>   }
> Given that this must be done for every caller of nvme_ctrl_get_by_path()
> shouldn't this be in the core code?
> 
Yes it should be.
> Logan
> 


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 05/14] nvmet: get rid of the extra variable
  2020-08-10 23:00     ` Chaitanya Kulkarni
@ 2020-08-10 23:13       ` Keith Busch
  2020-08-10 23:25         ` Chaitanya Kulkarni
  2020-08-10 23:32         ` Logan Gunthorpe
  0 siblings, 2 replies; 53+ messages in thread
From: Keith Busch @ 2020-08-10 23:13 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: Logan Gunthorpe, hch, linux-nvme, sagi

On Mon, Aug 10, 2020 at 11:00:55PM +0000, Chaitanya Kulkarni wrote:
> On 8/10/20 14:58, Logan Gunthorpe wrote:
> > IMO, that's really ugly and not at all an improvement.
> > 
> > Logan
> 
> && operator is perfectly readable when use for short circuit two 
> conditions, more than 2 or newline is what ugly not this one.

The odd looking part is mixing "likely()" and "unlikely()" in the same
conditional.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 14/14] nvmet: use inline bio for passthru fast path
  2020-08-10 22:47   ` Logan Gunthorpe
@ 2020-08-10 23:14     ` Chaitanya Kulkarni
  0 siblings, 0 replies; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 23:14 UTC (permalink / raw)
  To: Logan Gunthorpe, linux-nvme; +Cc: hch, sagi

On 8/10/20 15:47, Logan Gunthorpe wrote:
>> +	}
>> +
>> +	bio->bi_end_io = nvmet_passthru_bio_done;
> Why not just only set bi_end_io to bio_put if it's not an inline bio?
> Then we don't need the extra bio done function and an extra if() on
> every bio_put() call.
> 

It is not a big difference, just keeping the code consistent in target.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 01/14] nvmet: for pt I/O cmds annotate req->sg_cnt likely
  2020-08-10 22:55       ` Logan Gunthorpe
@ 2020-08-10 23:16         ` Chaitanya Kulkarni
  2020-08-14 10:33         ` hch
  1 sibling, 0 replies; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 23:16 UTC (permalink / raw)
  To: Logan Gunthorpe, linux-nvme; +Cc: hch, sagi

On 8/10/20 15:56, Logan Gunthorpe wrote:
>> They are present in the target code. I don't see any problem having it
>> in a passthru code unless it shows severe performance degradation.
> Isn't that backwards? Don't we only add performance optimizations (which
> likely/unlikely are) when they show demonstrable increases in performance?
> 
> Logan
> 

We can always do micro-benchmark, in this case code, frequency and 
conditions are pretty clear and straight forward.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 03/14] nvmet: use consistent type with id->nlbaf
  2020-08-10 22:57       ` Logan Gunthorpe
@ 2020-08-10 23:20         ` Chaitanya Kulkarni
  0 siblings, 0 replies; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 23:20 UTC (permalink / raw)
  To: Logan Gunthorpe, linux-nvme; +Cc: hch, sagi

On 8/10/20 15:57, Logan Gunthorpe wrote:
>> It is a type miss-match, why we need fast type ? its not in fast path.
> It's just normal to use an int for a loop variable instead of something
> specific like a u8. The type mismatch isn't really an issue here...
> 
> Logan
> 
> 
> 

Just because it is norm that doesn't mean we should have type missmatch.

Also, what is it we are getting out of having type missmatch?

This may lead to some remote static analyzer generating type missmatch
warnings when someone tries to run proprietary code analysis tools.
This patch pro-actively avoids that.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag
  2020-08-10 23:10       ` Keith Busch
@ 2020-08-10 23:23         ` Chaitanya Kulkarni
  2020-08-10 23:26           ` Keith Busch
  0 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 23:23 UTC (permalink / raw)
  To: Keith Busch; +Cc: hch, linux-nvme, sagi

On 8/10/20 16:10, Keith Busch wrote:
>> Good point, maybe they are needed for backward compatibility ?
>> Can we be consistent with the backends and different flags ?
> I don't really follow what compatibility this could be about, but if
> they're not used for REQ_OP_DRV_IN/OUT commands, as it appears to me to
> be the case, you should just remove them rather than add more.
> 

Then we have to remove in bdev backend also ?

Sagi, Christoph please comment.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag
  2020-08-10 23:08     ` Chaitanya Kulkarni
@ 2020-08-10 23:23       ` Logan Gunthorpe
  2020-08-10 23:32         ` Chaitanya Kulkarni
  0 siblings, 1 reply; 53+ messages in thread
From: Logan Gunthorpe @ 2020-08-10 23:23 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: hch, sagi



On 2020-08-10 5:08 p.m., Chaitanya Kulkarni wrote:
>>> +static inline bool nvmet_write_cmd_group(__u8 opcode)
>>> +{
>>> +	bool ret;
>>> +
>>> +	/* NVM Express Revision 1.4 Figure 462. */
>>> +	switch (opcode) {
>>> +	case nvme_admin_format_nvm:
>>> +	case nvme_admin_ns_attach:
>>> +	case nvme_admin_ns_mgmt:
>>> +	case nvme_admin_sanitize_nvm:
>>> +	case nvme_admin_security_send:
>>
>> All of these admin commands are rejected as INVALID_OPCODE in
>> nvmet_parse_passthru_admin_cmd(). So adding them here seems confusing at
>> best.
>>
> 
> These are extremely important commands, why these commands are
> rejected ?

In some cases we were worried about side effects with other hosts
accessing the namespace, in others they require using the ctrl_id which
isn't really mapped in a sane way.

> What is the use of passthru if host cannot create namespaces format ns 
> and sanitize the important data ?

Plenty, they can format the ns on the native host and still use them on
other targets.

> Irrespective of that it better have it documented there is no harm.

We shouldn't have code that doesn't actually do anything. It's confusing
and bad practice.

Logan

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 05/14] nvmet: get rid of the extra variable
  2020-08-10 23:13       ` Keith Busch
@ 2020-08-10 23:25         ` Chaitanya Kulkarni
  2020-08-10 23:32         ` Logan Gunthorpe
  1 sibling, 0 replies; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 23:25 UTC (permalink / raw)
  To: Keith Busch; +Cc: Logan Gunthorpe, hch, linux-nvme, sagi

On 8/10/20 16:13, Keith Busch wrote:
> The odd looking part is mixing "likely()" and "unlikely()" in the same
> conditional.
> 

Actually The sg_cnt check can be moved into nvmet_passthru_map_sg() then 
we can get rid of the odd looking likely() and unlikely()
and have only one call.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag
  2020-08-10 23:23         ` Chaitanya Kulkarni
@ 2020-08-10 23:26           ` Keith Busch
  2020-08-10 23:33             ` Chaitanya Kulkarni
  0 siblings, 1 reply; 53+ messages in thread
From: Keith Busch @ 2020-08-10 23:26 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: hch, linux-nvme, sagi

On Mon, Aug 10, 2020 at 11:23:03PM +0000, Chaitanya Kulkarni wrote:
> On 8/10/20 16:10, Keith Busch wrote:
> >> Good point, maybe they are needed for backward compatibility ?
> >> Can we be consistent with the backends and different flags ?
> > I don't really follow what compatibility this could be about, but if
> > they're not used for REQ_OP_DRV_IN/OUT commands, as it appears to me to
> > be the case, you should just remove them rather than add more.
> > 
> 
> Then we have to remove in bdev backend also ?

No. That target doesn't use REQ_OP_DRV_IN/OUT request ops through
blk_execute_rq(). That target uses REQ_OP_READ/WRITE bio ops through
submit_bio(). Big difference.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag
  2020-08-10 23:23       ` Logan Gunthorpe
@ 2020-08-10 23:32         ` Chaitanya Kulkarni
  2020-08-10 23:34           ` Logan Gunthorpe
  0 siblings, 1 reply; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 23:32 UTC (permalink / raw)
  To: Logan Gunthorpe, linux-nvme; +Cc: hch, sagi

On 8/10/20 16:23, Logan Gunthorpe wrote:
>> rejected ?
> In some cases we were worried about side effects with other hosts
> accessing the namespace, in others they require using the ctrl_id which
> isn't really mapped in a sane way.
> 

Thanks for the explanation. The first case is completely admin's 
responsibility how they want to use it and share it best we can provide
a documentation.

The second case we need to fix it, coz that defeats the whole purpose
of having passthru driver without ns-mgmt and critical commands which I
know are very important in the data centers.

>> What is the use of passthru if host cannot create namespaces format ns
>> and sanitize the important data ?
> Plenty, they can format the ns on the native host and still use them on
> other targets.
> 
>> Irrespective of that it better have it documented there is no harm.
> We shouldn't have code that doesn't actually do anything. It's confusing
> and bad practice.
> 

You are right.

These commands are important for data centers and should be supported.
Regarding this patch I'll remove these commands and I'll send a patch
when these admin commands are supported.

> Logan
> 


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 05/14] nvmet: get rid of the extra variable
  2020-08-10 23:13       ` Keith Busch
  2020-08-10 23:25         ` Chaitanya Kulkarni
@ 2020-08-10 23:32         ` Logan Gunthorpe
  1 sibling, 0 replies; 53+ messages in thread
From: Logan Gunthorpe @ 2020-08-10 23:32 UTC (permalink / raw)
  To: Keith Busch, Chaitanya Kulkarni; +Cc: hch, linux-nvme, sagi



On 2020-08-10 5:13 p.m., Keith Busch wrote:
> On Mon, Aug 10, 2020 at 11:00:55PM +0000, Chaitanya Kulkarni wrote:
>> On 8/10/20 14:58, Logan Gunthorpe wrote:
>>> IMO, that's really ugly and not at all an improvement.
>>>
>>> Logan
>>
>> && operator is perfectly readable when use for short circuit two 
>> conditions, more than 2 or newline is what ugly not this one.
> 
> The odd looking part is mixing "likely()" and "unlikely()" in the same
> conditional.
> 

Yes. Also, I've found kernel developers generally prefer functions with
long imperative names to not be used directly in an if statement. Just
search for 'if (ret)' in host/core.c to see many examples that could be
folded into the if statement, when they are not. Only functions with
names that are predicates, or answer questions, may be used in if
statements.

Logan

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag
  2020-08-10 23:26           ` Keith Busch
@ 2020-08-10 23:33             ` Chaitanya Kulkarni
  0 siblings, 0 replies; 53+ messages in thread
From: Chaitanya Kulkarni @ 2020-08-10 23:33 UTC (permalink / raw)
  To: Keith Busch; +Cc: hch, linux-nvme, sagi

On 8/10/20 16:26, Keith Busch wrote:
>> Then we have to remove in bdev backend also ?
> No. That target doesn't use REQ_OP_DRV_IN/OUT request ops through
> blk_execute_rq(). That target uses REQ_OP_READ/WRITE bio ops through
> submit_bio(). Big difference.
> 

Thanks for correcting my ignorance, probably then we should just remove 
it, will do in next version, thanks a lot for feedback.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag
  2020-08-10 23:32         ` Chaitanya Kulkarni
@ 2020-08-10 23:34           ` Logan Gunthorpe
  0 siblings, 0 replies; 53+ messages in thread
From: Logan Gunthorpe @ 2020-08-10 23:34 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: hch, sagi



On 2020-08-10 5:32 p.m., Chaitanya Kulkarni wrote:
> On 8/10/20 16:23, Logan Gunthorpe wrote:
>>> rejected ?
>> In some cases we were worried about side effects with other hosts
>> accessing the namespace, in others they require using the ctrl_id which
>> isn't really mapped in a sane way.
>>
> 
> Thanks for the explanation. The first case is completely admin's 
> responsibility how they want to use it and share it best we can provide
> a documentation.
> 
> The second case we need to fix it, coz that defeats the whole purpose
> of having passthru driver without ns-mgmt and critical commands which I
> know are very important in the data centers.

Then send patches and make the case that they are safe. Sagi was the
biggest opponent of allowing various admin commands.

Logan

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 01/14] nvmet: for pt I/O cmds annotate req->sg_cnt likely
  2020-08-10 22:55       ` Logan Gunthorpe
  2020-08-10 23:16         ` Chaitanya Kulkarni
@ 2020-08-14 10:33         ` hch
  1 sibling, 0 replies; 53+ messages in thread
From: hch @ 2020-08-14 10:33 UTC (permalink / raw)
  To: Logan Gunthorpe; +Cc: sagi, linux-nvme, Chaitanya Kulkarni, hch

On Mon, Aug 10, 2020 at 04:55:54PM -0600, Logan Gunthorpe wrote:
> 
> 
> On 2020-08-10 4:54 p.m., Chaitanya Kulkarni wrote:
> > On 8/10/20 14:54, Logan Gunthorpe wrote:
> >> Hey,
> >>
> >> You forgot to CC me again....
> >>
> > Ugghhh..
> >> On 2020-08-10 12:54 p.m., Chaitanya Kulkarni wrote:
> >>> I/O Commands (nvme_cmd_read, nvme_cmd_write) are most command commands
> >>> when accessing passthru controller. For I/O commands data is represented
> >>> in SG list, so sg_cnt != 0 in most cases.
> >>>
> >>> Annotate req->sg_cnt check with likely.
> >> Does adding these likely annotations actually produce any benefit? Why
> >> can't we rely on the branch predictor in these instances?
> >>
> > 
> > They are present in the target code. I don't see any problem having it 
> > in a passthru code unless it shows severe performance degradation.
> 
> Isn't that backwards? Don't we only add performance optimizations (which
> likely/unlikely are) when they show demonstrable increases in performance?

Yes, this looks like pretty premature optimization.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 07/14] nvmet: reuse pt-req-done for rq completion
  2020-08-10 23:02     ` Chaitanya Kulkarni
@ 2020-08-14 10:37       ` hch
  0 siblings, 0 replies; 53+ messages in thread
From: hch @ 2020-08-14 10:37 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: Logan Gunthorpe, hch, linux-nvme, sagi

On Mon, Aug 10, 2020 at 11:02:01PM +0000, Chaitanya Kulkarni wrote:
> On 8/10/20 15:10, Logan Gunthorpe wrote:
> > Christoph had specifically requested open coding this in earlier feedback:
> > 
> > https://lore.kernel.org/linux-block/20191010123406.GC28921@lst.de/
> > 
> > It was originally called nvmet_passthru_req_complete() and was open
> > coded at his request.
> > 
> > I think the way it was merged ended up being cleaner and easier to read
> > as well.
> > 
> > Logan
> 
> I don't understand the reason for this ? Christoph can you please explain ?

Because this just happens to be a few lines of code that looks the
same.  But they are called from different context - one is a callback
and thus has an extra argument for example.  I don't see how sharing
that code really improves anything.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 09/14] nvmet: based on passthru req control use REQ_FUA
  2020-08-10 18:54 ` [PATCH 09/14] nvmet: based on passthru req control use REQ_FUA Chaitanya Kulkarni
@ 2020-08-14 10:37   ` Christoph Hellwig
  0 siblings, 0 replies; 53+ messages in thread
From: Christoph Hellwig @ 2020-08-14 10:37 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: hch, linux-nvme, sagi

On Mon, Aug 10, 2020 at 11:54:51AM -0700, Chaitanya Kulkarni wrote:
> The write request for passthru driver currently does not consider
> nvme control field which could be set in the nvme_rw_command from host
> side.
> 
> Update op_flags with REQ_FUA when control field flag is set to
> NVME_RW_FUA.

REQ_FUA has no meaning for REQ_OP_DRV*

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2020-08-14 10:37 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-10 18:54 [PATCH 00/14] nvmet: passthru few fixes and improvements Chaitanya Kulkarni
2020-08-10 18:54 ` [PATCH 01/14] nvmet: for pt I/O cmds annotate req->sg_cnt likely Chaitanya Kulkarni
2020-08-10 21:54   ` Logan Gunthorpe
2020-08-10 22:54     ` Chaitanya Kulkarni
2020-08-10 22:55       ` Logan Gunthorpe
2020-08-10 23:16         ` Chaitanya Kulkarni
2020-08-14 10:33         ` hch
2020-08-10 18:54 ` [PATCH 02/14] nvmet: for pt I/O commands use likely for ns check Chaitanya Kulkarni
2020-08-10 18:54 ` [PATCH 03/14] nvmet: use consistent type with id->nlbaf Chaitanya Kulkarni
2020-08-10 21:55   ` Logan Gunthorpe
2020-08-10 22:56     ` Chaitanya Kulkarni
2020-08-10 22:57       ` Logan Gunthorpe
2020-08-10 23:20         ` Chaitanya Kulkarni
2020-08-10 18:54 ` [PATCH 04/14] nvmet: use consistent type for op_flag Chaitanya Kulkarni
2020-08-10 21:56   ` Logan Gunthorpe
2020-08-10 22:56     ` Chaitanya Kulkarni
2020-08-10 18:54 ` [PATCH 05/14] nvmet: get rid of the extra variable Chaitanya Kulkarni
2020-08-10 21:58   ` Logan Gunthorpe
2020-08-10 23:00     ` Chaitanya Kulkarni
2020-08-10 23:13       ` Keith Busch
2020-08-10 23:25         ` Chaitanya Kulkarni
2020-08-10 23:32         ` Logan Gunthorpe
2020-08-10 18:54 ` [PATCH 06/14] nvmet: use unlikely for uncommon commands Chaitanya Kulkarni
2020-08-10 20:46   ` Keith Busch
2020-08-10 22:01     ` Logan Gunthorpe
2020-08-10 18:54 ` [PATCH 07/14] nvmet: reuse pt-req-done for rq completion Chaitanya Kulkarni
2020-08-10 22:10   ` Logan Gunthorpe
2020-08-10 23:02     ` Chaitanya Kulkarni
2020-08-14 10:37       ` hch
2020-08-10 18:54 ` [PATCH 08/14] nvmet: use nvme write cmd group setting op_flag Chaitanya Kulkarni
2020-08-10 21:11   ` Keith Busch
2020-08-10 23:04     ` Chaitanya Kulkarni
2020-08-10 23:10       ` Keith Busch
2020-08-10 23:23         ` Chaitanya Kulkarni
2020-08-10 23:26           ` Keith Busch
2020-08-10 23:33             ` Chaitanya Kulkarni
2020-08-10 22:14   ` Logan Gunthorpe
2020-08-10 23:08     ` Chaitanya Kulkarni
2020-08-10 23:23       ` Logan Gunthorpe
2020-08-10 23:32         ` Chaitanya Kulkarni
2020-08-10 23:34           ` Logan Gunthorpe
2020-08-10 18:54 ` [PATCH 09/14] nvmet: based on passthru req control use REQ_FUA Chaitanya Kulkarni
2020-08-14 10:37   ` Christoph Hellwig
2020-08-10 18:54 ` [PATCH 10/14] nvmet: prep patch for pt ctrl put wrapper Chaitanya Kulkarni
2020-08-10 18:54 ` [PATCH 11/14] nvmet: fix nvme module ref count Oops Chaitanya Kulkarni
2020-08-10 22:39   ` Logan Gunthorpe
2020-08-10 23:11     ` Chaitanya Kulkarni
2020-08-10 18:54 ` [PATCH 12/14] block: move blk_rq_bio_prep() to linux/blk-mq.h Chaitanya Kulkarni
2020-08-10 18:54 ` [PATCH 13/14] nvmet: use minimized version of blk_rq_append_bio Chaitanya Kulkarni
2020-08-10 22:41   ` Logan Gunthorpe
2020-08-10 18:54 ` [PATCH 14/14] nvmet: use inline bio for passthru fast path Chaitanya Kulkarni
2020-08-10 22:47   ` Logan Gunthorpe
2020-08-10 23:14     ` Chaitanya Kulkarni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).