linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/17] enable nvmet-fc for blktests
@ 2023-12-18 15:30 Daniel Wagner
  2023-12-18 15:30 ` [PATCH v3 01/16] nvmet: report ioccsz and iorcsz for disc ctrl Daniel Wagner
                   ` (18 more replies)
  0 siblings, 19 replies; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:30 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

Another update on getting nvmet-fc ready for blktests. The main change here is
that I tried make sense of the ref count taking in nvmet-fc. When running
blktests with the auto connect udev rule activated the additional connect
attempt etc made nvmet-fc explode and choke everywhere. After a lot of poking
and pondering I decided to change the rules who the ref counts are taken for the
ctrl, association, target port and host port. This made a big difference and I
am able to get blktests pass the tests.

Also KASAN was reporting a lot of UAFs. There are still some problems left as I
can still observe hangers when running blktests in a loop for a while. But it
doesn't explode immediately so I consider this a win.

Apropos KASAN, it still reports the problem from [1], so anyone who want to run
this series needs to revert ee6fdc5055e9 ("nvme-fc: fix race between error
recovery and creating association").

The first four patches are independent of the rest and could go in sooner.

[1] https://lore.kernel.org/linux-nvme/hkhl56n665uvc6t5d6h3wtx7utkcorw4xlwi7d2t2bnonavhe6@xaan6pu43ap6/

changes:
v3:
 - collected all patches into one series
 - updated ref counting in nvmet-fc

v2:
  - added RBs
  - added new patches
  - https://lore.kernel.org/linux-nvme/20230620133711.22840-1-dwagner@suse.de/
  
v1:
  - https://lore.kernel.org/linux-nvme/20230615094356.14878-1-dwagner@suse.de/ 


Daniel Wagner (16):
  nvmet: report ioccsz and iorcsz for disc ctrl
  nvmet-fc: remove unnecessary bracket
  nvmet-trace: avoid dereferencing pointer too early
  nvmet-trace: null terminate device name string correctly
  nvmet-fcloop: Remove remote port from list when unlinking
  nvme-fc: Do not wait in vain when unloading module
  nvmet-fc: Release reference on target port
  nvmet-fc: untangle cross refcounting objects
  nvmet-fc: free queue and assoc directly
  nvmet-fc: hold reference on hostport match
  nvmet-fc: remove null hostport pointer check
  nvmet-fc: do not tack refs on tgtports from assoc
  nvmet-fc: abort command if when there is binding
  nvmet-fc: free hostport after release reference to tgtport
  nvmet-fc: avoid deadlock on delete association path
  nvmet-fc: take ref count on tgtport before delete assoc

 drivers/nvme/host/fc.c          |  20 +++--
 drivers/nvme/target/discovery.c |  13 +++
 drivers/nvme/target/fc.c        | 153 ++++++++++++++++++--------------
 drivers/nvme/target/fcloop.c    |   7 +-
 drivers/nvme/target/trace.c     |   6 +-
 drivers/nvme/target/trace.h     |  33 ++++---
 6 files changed, 135 insertions(+), 97 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v3 01/16] nvmet: report ioccsz and iorcsz for disc ctrl
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
@ 2023-12-18 15:30 ` Daniel Wagner
  2023-12-19  4:27   ` Christoph Hellwig
  2023-12-19  7:24   ` Hannes Reinecke
  2023-12-18 15:30 ` [PATCH v3 02/16] nvmet-fc: remove unnecessary bracket Daniel Wagner
                   ` (17 subsequent siblings)
  18 siblings, 2 replies; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:30 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

The host started to verify the ioccsz and iorcsz values. I/O controllers
return valid values but not the discovery controllers. Use the same
values as for I/O controllers.

Fixes: 2fcd3ab39826 ("nvme-fabrics: check ioccsz and iorcsz")
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/discovery.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/nvme/target/discovery.c b/drivers/nvme/target/discovery.c
index 668d257fa986..e3c4d247dd23 100644
--- a/drivers/nvme/target/discovery.c
+++ b/drivers/nvme/target/discovery.c
@@ -249,6 +249,7 @@ static void nvmet_execute_disc_identify(struct nvmet_req *req)
 {
 	struct nvmet_ctrl *ctrl = req->sq->ctrl;
 	struct nvme_id_ctrl *id;
+	u32 cmd_capsule_size;
 	u16 status = 0;
 
 	if (!nvmet_check_transfer_len(req, NVME_IDENTIFY_DATA_SIZE))
@@ -294,6 +295,18 @@ static void nvmet_execute_disc_identify(struct nvmet_req *req)
 
 	strscpy(id->subnqn, ctrl->subsys->subsysnqn, sizeof(id->subnqn));
 
+	/*
+	 * Max command capsule size is sqe + in-capsule data size.
+	 * Disable in-capsule data for Metadata capable controllers.
+	 */
+	cmd_capsule_size = sizeof(struct nvme_command);
+	if (!ctrl->pi_support)
+		cmd_capsule_size += req->port->inline_data_size;
+	id->ioccsz = cpu_to_le32(cmd_capsule_size / 16);
+
+	/* Max response capsule size is cqe */
+	id->iorcsz = cpu_to_le32(sizeof(struct nvme_completion) / 16);
+
 	status = nvmet_copy_to_sgl(req, 0, id, sizeof(*id));
 
 	kfree(id);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 02/16] nvmet-fc: remove unnecessary bracket
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
  2023-12-18 15:30 ` [PATCH v3 01/16] nvmet: report ioccsz and iorcsz for disc ctrl Daniel Wagner
@ 2023-12-18 15:30 ` Daniel Wagner
  2023-12-19  4:27   ` Christoph Hellwig
  2023-12-19  7:25   ` Hannes Reinecke
  2023-12-18 15:30 ` [PATCH v3 03/16] nvmet-trace: avoid dereferencing pointer too early Daniel Wagner
                   ` (16 subsequent siblings)
  18 siblings, 2 replies; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:30 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

There is no need for the bracket around the identifier. Remove it.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/fc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index bd59990b5250..bda7a3009e85 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -1031,7 +1031,7 @@ nvmet_fc_match_hostport(struct nvmet_fc_tgtport *tgtport, void *hosthandle)
 	list_for_each_entry(host, &tgtport->host_list, host_list) {
 		if (host->hosthandle == hosthandle && !host->invalid) {
 			if (nvmet_fc_hostport_get(host))
-				return (host);
+				return host;
 		}
 	}
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 03/16] nvmet-trace: avoid dereferencing pointer too early
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
  2023-12-18 15:30 ` [PATCH v3 01/16] nvmet: report ioccsz and iorcsz for disc ctrl Daniel Wagner
  2023-12-18 15:30 ` [PATCH v3 02/16] nvmet-fc: remove unnecessary bracket Daniel Wagner
@ 2023-12-18 15:30 ` Daniel Wagner
  2023-12-19  4:29   ` Christoph Hellwig
  2023-12-18 15:30 ` [PATCH v3 04/16] nvmet-trace: null terminate device name string correctly Daniel Wagner
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:30 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

The first command issued from the host to the target is the fabrics
connect command. At this point, neither the target queue nor the
controller have been allocated. But we already try to trace this command
in nvmet_req_init.

Reported by KASAN.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/trace.c |  6 +++---
 drivers/nvme/target/trace.h | 28 +++++++++++++++++-----------
 2 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/drivers/nvme/target/trace.c b/drivers/nvme/target/trace.c
index bff454d46255..6ee1f3db81d0 100644
--- a/drivers/nvme/target/trace.c
+++ b/drivers/nvme/target/trace.c
@@ -211,7 +211,7 @@ const char *nvmet_trace_disk_name(struct trace_seq *p, char *name)
 	return ret;
 }
 
-const char *nvmet_trace_ctrl_name(struct trace_seq *p, struct nvmet_ctrl *ctrl)
+const char *nvmet_trace_ctrl_id(struct trace_seq *p, u16 ctrl_id)
 {
 	const char *ret = trace_seq_buffer_ptr(p);
 
@@ -224,8 +224,8 @@ const char *nvmet_trace_ctrl_name(struct trace_seq *p, struct nvmet_ctrl *ctrl)
 	 * If we can know the extra data of the connect command in this stage,
 	 * we can update this print statement later.
 	 */
-	if (ctrl)
-		trace_seq_printf(p, "%d", ctrl->cntlid);
+	if (ctrl_id)
+		trace_seq_printf(p, "%d", ctrl_id);
 	else
 		trace_seq_printf(p, "_");
 	trace_seq_putc(p, 0);
diff --git a/drivers/nvme/target/trace.h b/drivers/nvme/target/trace.h
index 6109b3806b12..68f5317b1251 100644
--- a/drivers/nvme/target/trace.h
+++ b/drivers/nvme/target/trace.h
@@ -32,18 +32,24 @@ const char *nvmet_trace_parse_fabrics_cmd(struct trace_seq *p, u8 fctype,
 	 nvmet_trace_parse_nvm_cmd(p, opcode, cdw10) :			\
 	 nvmet_trace_parse_admin_cmd(p, opcode, cdw10)))
 
-const char *nvmet_trace_ctrl_name(struct trace_seq *p, struct nvmet_ctrl *ctrl);
-#define __print_ctrl_name(ctrl)				\
-	nvmet_trace_ctrl_name(p, ctrl)
+const char *nvmet_trace_ctrl_id(struct trace_seq *p, u16 ctrl_id);
+#define __print_ctrl_id(ctrl_id)			\
+	nvmet_trace_ctrl_id(p, ctrl_id)
 
 const char *nvmet_trace_disk_name(struct trace_seq *p, char *name);
 #define __print_disk_name(name)				\
 	nvmet_trace_disk_name(p, name)
 
 #ifndef TRACE_HEADER_MULTI_READ
-static inline struct nvmet_ctrl *nvmet_req_to_ctrl(struct nvmet_req *req)
+static inline u16 nvmet_req_to_ctrl_id(struct nvmet_req *req)
 {
-	return req->sq->ctrl;
+	/*
+	 * The queue and controller pointer are not valid until an association
+	 * has been established.
+	 */
+	if (!req->sq || !req->sq->ctrl)
+		return 0;
+	return req->sq->ctrl->cntlid;
 }
 
 static inline void __assign_req_name(char *name, struct nvmet_req *req)
@@ -63,7 +69,7 @@ TRACE_EVENT(nvmet_req_init,
 	TP_ARGS(req, cmd),
 	TP_STRUCT__entry(
 		__field(struct nvme_command *, cmd)
-		__field(struct nvmet_ctrl *, ctrl)
+		__field(u16, ctrl_id)
 		__array(char, disk, DISK_NAME_LEN)
 		__field(int, qid)
 		__field(u16, cid)
@@ -76,7 +82,7 @@ TRACE_EVENT(nvmet_req_init,
 	),
 	TP_fast_assign(
 		__entry->cmd = cmd;
-		__entry->ctrl = nvmet_req_to_ctrl(req);
+		__entry->ctrl_id = nvmet_req_to_ctrl_id(req);
 		__assign_req_name(__entry->disk, req);
 		__entry->qid = req->sq->qid;
 		__entry->cid = cmd->common.command_id;
@@ -90,7 +96,7 @@ TRACE_EVENT(nvmet_req_init,
 	),
 	TP_printk("nvmet%s: %sqid=%d, cmdid=%u, nsid=%u, flags=%#x, "
 		  "meta=%#llx, cmd=(%s, %s)",
-		__print_ctrl_name(__entry->ctrl),
+		__print_ctrl_id(__entry->ctrl_id),
 		__print_disk_name(__entry->disk),
 		__entry->qid, __entry->cid, __entry->nsid,
 		__entry->flags, __entry->metadata,
@@ -104,7 +110,7 @@ TRACE_EVENT(nvmet_req_complete,
 	TP_PROTO(struct nvmet_req *req),
 	TP_ARGS(req),
 	TP_STRUCT__entry(
-		__field(struct nvmet_ctrl *, ctrl)
+		__field(u16, ctrl_id)
 		__array(char, disk, DISK_NAME_LEN)
 		__field(int, qid)
 		__field(int, cid)
@@ -112,7 +118,7 @@ TRACE_EVENT(nvmet_req_complete,
 		__field(u16, status)
 	),
 	TP_fast_assign(
-		__entry->ctrl = nvmet_req_to_ctrl(req);
+		__entry->ctrl_id = nvmet_req_to_ctrl_id(req);
 		__entry->qid = req->cq->qid;
 		__entry->cid = req->cqe->command_id;
 		__entry->result = le64_to_cpu(req->cqe->result.u64);
@@ -120,7 +126,7 @@ TRACE_EVENT(nvmet_req_complete,
 		__assign_req_name(__entry->disk, req);
 	),
 	TP_printk("nvmet%s: %sqid=%d, cmdid=%u, res=%#llx, status=%#x",
-		__print_ctrl_name(__entry->ctrl),
+		__print_ctrl_id(__entry->ctrl_id),
 		__print_disk_name(__entry->disk),
 		__entry->qid, __entry->cid, __entry->result, __entry->status)
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 04/16] nvmet-trace: null terminate device name string correctly
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
                   ` (2 preceding siblings ...)
  2023-12-18 15:30 ` [PATCH v3 03/16] nvmet-trace: avoid dereferencing pointer too early Daniel Wagner
@ 2023-12-18 15:30 ` Daniel Wagner
  2023-12-19  4:29   ` Christoph Hellwig
  2023-12-18 15:30 ` [PATCH v3 05/16] nvmet-fcloop: Remove remote port from list when unlinking Daniel Wagner
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:30 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

strlen returns the string length excluding the null byte ('\0'), thus we
cut the last chars from the device name. While at it, switch snprintf to
ensure we always have properly terminated string.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/trace.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/target/trace.h b/drivers/nvme/target/trace.h
index 68f5317b1251..952e69f9737f 100644
--- a/drivers/nvme/target/trace.h
+++ b/drivers/nvme/target/trace.h
@@ -59,8 +59,9 @@ static inline void __assign_req_name(char *name, struct nvmet_req *req)
 		return;
 	}
 
-	strncpy(name, req->ns->device_path,
-		min_t(size_t, DISK_NAME_LEN, strlen(req->ns->device_path)));
+	snprintf(name,
+		 min_t(size_t, DISK_NAME_LEN, strlen(req->ns->device_path) + 1),
+		 "%s", req->ns->device_path);
 }
 #endif
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 05/16] nvmet-fcloop: Remove remote port from list when unlinking
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
                   ` (3 preceding siblings ...)
  2023-12-18 15:30 ` [PATCH v3 04/16] nvmet-trace: null terminate device name string correctly Daniel Wagner
@ 2023-12-18 15:30 ` Daniel Wagner
  2023-12-19  4:31   ` Christoph Hellwig
  2023-12-19  7:26   ` Hannes Reinecke
  2023-12-18 15:30 ` [PATCH v3 06/16] nvme-fc: Do not wait in vain when unloading module Daniel Wagner
                   ` (13 subsequent siblings)
  18 siblings, 2 replies; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:30 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

The remote port is removed too late from fcloop_nports list. Remove it
when port is unregistered.

This prevents a busy loop in fcloop_exit, because it is possible the
remote port is found in the list and thus we will never progress.

The kernel log will be spammed with

  nvme_fcloop: fcloop_exit: Failed deleting remote port
  nvme_fcloop: fcloop_exit: Failed deleting target port

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/fcloop.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/target/fcloop.c b/drivers/nvme/target/fcloop.c
index c65a73433c05..ead349af30f1 100644
--- a/drivers/nvme/target/fcloop.c
+++ b/drivers/nvme/target/fcloop.c
@@ -995,11 +995,6 @@ fcloop_nport_free(struct kref *ref)
 {
 	struct fcloop_nport *nport =
 		container_of(ref, struct fcloop_nport, ref);
-	unsigned long flags;
-
-	spin_lock_irqsave(&fcloop_lock, flags);
-	list_del(&nport->nport_list);
-	spin_unlock_irqrestore(&fcloop_lock, flags);
 
 	kfree(nport);
 }
@@ -1357,6 +1352,8 @@ __unlink_remote_port(struct fcloop_nport *nport)
 		nport->tport->remoteport = NULL;
 	nport->rport = NULL;
 
+	list_del(&nport->nport_list);
+
 	return rport;
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 06/16] nvme-fc: Do not wait in vain when unloading module
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
                   ` (4 preceding siblings ...)
  2023-12-18 15:30 ` [PATCH v3 05/16] nvmet-fcloop: Remove remote port from list when unlinking Daniel Wagner
@ 2023-12-18 15:30 ` Daniel Wagner
  2023-12-19  4:35   ` Christoph Hellwig
  2023-12-19  7:28   ` Hannes Reinecke
  2023-12-18 15:30 ` [PATCH v3 07/16] nvmet-fc: Release reference on target port Daniel Wagner
                   ` (12 subsequent siblings)
  18 siblings, 2 replies; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:30 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

The module unload code will wait for a controller to be delete even when
there is no controller and we wait for completion forever to happen.
Thus only wait for the completion when there is a controller which
needs to be removed.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/fc.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 15dc9dfe88a9..69f7943c5056 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3947,10 +3947,11 @@ static int __init nvme_fc_init_module(void)
 	return ret;
 }
 
-static void
+static bool
 nvme_fc_delete_controllers(struct nvme_fc_rport *rport)
 {
 	struct nvme_fc_ctrl *ctrl;
+	bool cleanup = false;
 
 	spin_lock(&rport->lock);
 	list_for_each_entry(ctrl, &rport->ctrl_list, ctrl_list) {
@@ -3958,21 +3959,28 @@ nvme_fc_delete_controllers(struct nvme_fc_rport *rport)
 			"NVME-FC{%d}: transport unloading: deleting ctrl\n",
 			ctrl->cnum);
 		nvme_delete_ctrl(&ctrl->ctrl);
+		cleanup = true;
 	}
 	spin_unlock(&rport->lock);
+
+	return cleanup;
 }
 
-static void
+static bool
 nvme_fc_cleanup_for_unload(void)
 {
 	struct nvme_fc_lport *lport;
 	struct nvme_fc_rport *rport;
+	bool cleanup = false;
 
 	list_for_each_entry(lport, &nvme_fc_lport_list, port_list) {
 		list_for_each_entry(rport, &lport->endp_list, endp_list) {
-			nvme_fc_delete_controllers(rport);
+			if (nvme_fc_delete_controllers(rport))
+				cleanup = true;
 		}
 	}
+
+	return cleanup;
 }
 
 static void __exit nvme_fc_exit_module(void)
@@ -3982,10 +3990,8 @@ static void __exit nvme_fc_exit_module(void)
 
 	spin_lock_irqsave(&nvme_fc_lock, flags);
 	nvme_fc_waiting_to_unload = true;
-	if (!list_empty(&nvme_fc_lport_list)) {
-		need_cleanup = true;
-		nvme_fc_cleanup_for_unload();
-	}
+	if (!list_empty(&nvme_fc_lport_list))
+		need_cleanup = nvme_fc_cleanup_for_unload();
 	spin_unlock_irqrestore(&nvme_fc_lock, flags);
 	if (need_cleanup) {
 		pr_info("%s: waiting for ctlr deletes\n", __func__);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 07/16] nvmet-fc: Release reference on target port
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
                   ` (5 preceding siblings ...)
  2023-12-18 15:30 ` [PATCH v3 06/16] nvme-fc: Do not wait in vain when unloading module Daniel Wagner
@ 2023-12-18 15:30 ` Daniel Wagner
  2023-12-19  4:36   ` Christoph Hellwig
  2023-12-19  7:28   ` Hannes Reinecke
  2023-12-18 15:30 ` [PATCH v3 08/16] nvmet-fc: untangle cross refcounting objects Daniel Wagner
                   ` (11 subsequent siblings)
  18 siblings, 2 replies; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:30 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

In case we return early out of __nvmet_fc_finish_ls_req() we still have
to release the reference on the target port.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/fc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index bda7a3009e85..28e432e62361 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -360,6 +360,7 @@ __nvmet_fc_finish_ls_req(struct nvmet_fc_ls_req_op *lsop)
 
 	if (!lsop->req_queued) {
 		spin_unlock_irqrestore(&tgtport->lock, flags);
+		nvmet_fc_tgtport_put(tgtport);
 		return;
 	}
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 08/16] nvmet-fc: untangle cross refcounting objects
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
                   ` (6 preceding siblings ...)
  2023-12-18 15:30 ` [PATCH v3 07/16] nvmet-fc: Release reference on target port Daniel Wagner
@ 2023-12-18 15:30 ` Daniel Wagner
  2023-12-19  5:16   ` Christoph Hellwig
  2023-12-19  7:59   ` Hannes Reinecke
  2023-12-18 15:30 ` [PATCH v3 09/16] nvmet-fc: free queue and assoc directly Daniel Wagner
                   ` (10 subsequent siblings)
  18 siblings, 2 replies; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:30 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

Associations take a refcount on queues, queues take a refcount on
associations.

The existing code lead to the situation that the target executes a
disconnect and the host triggers a reconnect immediately. The reconnect
command still finds an existing association and uses this. Though the
reconnect crashes later on because nvmet_fc_delete_target_assoc()
blindly goes ahead and removes resources while the reconnect code wants
to use it. The problem is that nvmet_fc_find_target_assoc() is able to
lookup an association which is being removed.

So the first thing to address nvmet_fc_find_target_queue() is to remove
the association out of the list and wait a RCU cycle and free resources
in the free function callback of the kref_put().

The live time of the queues are strictly bound to the lifetime of an
association. Thus we don't need to take reverse refcounts (queue ->
association).

Furthermore, streamline the cleanup code by using the workqueue for
delete the association in nvmet_fc_ls_disconnect. This ensures, that we
run through the same shutdown path in all non error cases.

Reproducer: nvme/003

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/fc.c | 83 +++++++++++++++++++---------------------
 1 file changed, 40 insertions(+), 43 deletions(-)

diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index 28e432e62361..db992df13c73 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -166,6 +166,7 @@ struct nvmet_fc_tgt_assoc {
 	struct nvmet_fc_hostport	*hostport;
 	struct nvmet_fc_ls_iod		*rcv_disconn;
 	struct list_head		a_list;
+	struct nvmet_fc_tgt_queue	*_queues[NVMET_NR_QUEUES + 1];
 	struct nvmet_fc_tgt_queue __rcu	*queues[NVMET_NR_QUEUES + 1];
 	struct kref			ref;
 	struct work_struct		del_work;
@@ -803,14 +804,11 @@ nvmet_fc_alloc_target_queue(struct nvmet_fc_tgt_assoc *assoc,
 	if (!queue)
 		return NULL;
 
-	if (!nvmet_fc_tgt_a_get(assoc))
-		goto out_free_queue;
-
 	queue->work_q = alloc_workqueue("ntfc%d.%d.%d", 0, 0,
 				assoc->tgtport->fc_target_port.port_num,
 				assoc->a_id, qid);
 	if (!queue->work_q)
-		goto out_a_put;
+		goto out_free_queue;
 
 	queue->qid = qid;
 	queue->sqsize = sqsize;
@@ -831,7 +829,8 @@ nvmet_fc_alloc_target_queue(struct nvmet_fc_tgt_assoc *assoc,
 	if (ret)
 		goto out_fail_iodlist;
 
-	WARN_ON(assoc->queues[qid]);
+	WARN_ON(assoc->_queues[qid]);
+	assoc->_queues[qid] = queue;
 	rcu_assign_pointer(assoc->queues[qid], queue);
 
 	return queue;
@@ -839,8 +838,6 @@ nvmet_fc_alloc_target_queue(struct nvmet_fc_tgt_assoc *assoc,
 out_fail_iodlist:
 	nvmet_fc_destroy_fcp_iodlist(assoc->tgtport, queue);
 	destroy_workqueue(queue->work_q);
-out_a_put:
-	nvmet_fc_tgt_a_put(assoc);
 out_free_queue:
 	kfree(queue);
 	return NULL;
@@ -853,12 +850,8 @@ nvmet_fc_tgt_queue_free(struct kref *ref)
 	struct nvmet_fc_tgt_queue *queue =
 		container_of(ref, struct nvmet_fc_tgt_queue, ref);
 
-	rcu_assign_pointer(queue->assoc->queues[queue->qid], NULL);
-
 	nvmet_fc_destroy_fcp_iodlist(queue->assoc->tgtport, queue);
 
-	nvmet_fc_tgt_a_put(queue->assoc);
-
 	destroy_workqueue(queue->work_q);
 
 	kfree_rcu(queue, rcu);
@@ -1173,13 +1166,18 @@ nvmet_fc_target_assoc_free(struct kref *ref)
 	struct nvmet_fc_tgtport *tgtport = assoc->tgtport;
 	struct nvmet_fc_ls_iod	*oldls;
 	unsigned long flags;
+	int i;
+
+	for (i = NVMET_NR_QUEUES; i >= 0; i--) {
+		if (assoc->_queues[i])
+			nvmet_fc_delete_target_queue(assoc->_queues[i]);
+	}
 
 	/* Send Disconnect now that all i/o has completed */
 	nvmet_fc_xmt_disconnect_assoc(assoc);
 
 	nvmet_fc_free_hostport(assoc->hostport);
 	spin_lock_irqsave(&tgtport->lock, flags);
-	list_del_rcu(&assoc->a_list);
 	oldls = assoc->rcv_disconn;
 	spin_unlock_irqrestore(&tgtport->lock, flags);
 	/* if pending Rcv Disconnect Association LS, send rsp now */
@@ -1209,7 +1207,7 @@ static void
 nvmet_fc_delete_target_assoc(struct nvmet_fc_tgt_assoc *assoc)
 {
 	struct nvmet_fc_tgtport *tgtport = assoc->tgtport;
-	struct nvmet_fc_tgt_queue *queue;
+	unsigned long flags;
 	int i, terminating;
 
 	terminating = atomic_xchg(&assoc->terminating, 1);
@@ -1218,29 +1216,25 @@ nvmet_fc_delete_target_assoc(struct nvmet_fc_tgt_assoc *assoc)
 	if (terminating)
 		return;
 
+	/* prevent new I/Os entering the queues */
+	for (i = NVMET_NR_QUEUES; i >= 0; i--)
+		rcu_assign_pointer(assoc->queues[i], NULL);
 
-	for (i = NVMET_NR_QUEUES; i >= 0; i--) {
-		rcu_read_lock();
-		queue = rcu_dereference(assoc->queues[i]);
-		if (!queue) {
-			rcu_read_unlock();
-			continue;
-		}
+	spin_lock_irqsave(&tgtport->lock, flags);
+	list_del_rcu(&assoc->a_list);
+	spin_unlock_irqrestore(&tgtport->lock, flags);
 
-		if (!nvmet_fc_tgt_q_get(queue)) {
-			rcu_read_unlock();
-			continue;
-		}
-		rcu_read_unlock();
-		nvmet_fc_delete_target_queue(queue);
-		nvmet_fc_tgt_q_put(queue);
+	synchronize_rcu();
+
+	/* ensure all in-flight I/Os have been processed */
+	for (i = NVMET_NR_QUEUES; i >= 0; i--) {
+		if (assoc->_queues[i])
+			flush_workqueue(assoc->_queues[i]->work_q);
 	}
 
 	dev_info(tgtport->dev,
 		"{%d:%d} Association deleted\n",
 		tgtport->fc_target_port.port_num, assoc->a_id);
-
-	nvmet_fc_tgt_a_put(assoc);
 }
 
 static struct nvmet_fc_tgt_assoc *
@@ -1493,9 +1487,8 @@ __nvmet_fc_free_assocs(struct nvmet_fc_tgtport *tgtport)
 	list_for_each_entry_rcu(assoc, &tgtport->assoc_list, a_list) {
 		if (!nvmet_fc_tgt_a_get(assoc))
 			continue;
-		if (!queue_work(nvmet_wq, &assoc->del_work))
-			/* already deleting - release local reference */
-			nvmet_fc_tgt_a_put(assoc);
+		queue_work(nvmet_wq, &assoc->del_work);
+		nvmet_fc_tgt_a_put(assoc);
 	}
 	rcu_read_unlock();
 }
@@ -1548,9 +1541,8 @@ nvmet_fc_invalidate_host(struct nvmet_fc_target_port *target_port,
 			continue;
 		assoc->hostport->invalid = 1;
 		noassoc = false;
-		if (!queue_work(nvmet_wq, &assoc->del_work))
-			/* already deleting - release local reference */
-			nvmet_fc_tgt_a_put(assoc);
+		queue_work(nvmet_wq, &assoc->del_work);
+		nvmet_fc_tgt_a_put(assoc);
 	}
 	spin_unlock_irqrestore(&tgtport->lock, flags);
 
@@ -1594,9 +1586,8 @@ nvmet_fc_delete_ctrl(struct nvmet_ctrl *ctrl)
 		nvmet_fc_tgtport_put(tgtport);
 
 		if (found_ctrl) {
-			if (!queue_work(nvmet_wq, &assoc->del_work))
-				/* already deleting - release local reference */
-				nvmet_fc_tgt_a_put(assoc);
+			queue_work(nvmet_wq, &assoc->del_work);
+			nvmet_fc_tgt_a_put(assoc);
 			return;
 		}
 
@@ -1626,6 +1617,8 @@ nvmet_fc_unregister_targetport(struct nvmet_fc_target_port *target_port)
 	/* terminate any outstanding associations */
 	__nvmet_fc_free_assocs(tgtport);
 
+	flush_workqueue(nvmet_wq);
+
 	/*
 	 * should terminate LS's as well. However, LS's will be generated
 	 * at the tail end of association termination, so they likely don't
@@ -1871,9 +1864,6 @@ nvmet_fc_ls_disconnect(struct nvmet_fc_tgtport *tgtport,
 				sizeof(struct fcnvme_ls_disconnect_assoc_acc)),
 			FCNVME_LS_DISCONNECT_ASSOC);
 
-	/* release get taken in nvmet_fc_find_target_assoc */
-	nvmet_fc_tgt_a_put(assoc);
-
 	/*
 	 * The rules for LS response says the response cannot
 	 * go back until ABTS's have been sent for all outstanding
@@ -1888,8 +1878,6 @@ nvmet_fc_ls_disconnect(struct nvmet_fc_tgtport *tgtport,
 	assoc->rcv_disconn = iod;
 	spin_unlock_irqrestore(&tgtport->lock, flags);
 
-	nvmet_fc_delete_target_assoc(assoc);
-
 	if (oldls) {
 		dev_info(tgtport->dev,
 			"{%d:%d} Multiple Disconnect Association LS's "
@@ -1905,6 +1893,9 @@ nvmet_fc_ls_disconnect(struct nvmet_fc_tgtport *tgtport,
 		nvmet_fc_xmt_ls_rsp(tgtport, oldls);
 	}
 
+	queue_work(nvmet_wq, &assoc->del_work);
+	nvmet_fc_tgt_a_put(assoc);
+
 	return false;
 }
 
@@ -2903,6 +2894,9 @@ nvmet_fc_remove_port(struct nvmet_port *port)
 
 	nvmet_fc_portentry_unbind(pe);
 
+	/* terminate any outstanding associations */
+	__nvmet_fc_free_assocs(pe->tgtport);
+
 	kfree(pe);
 }
 
@@ -2934,6 +2928,9 @@ static int __init nvmet_fc_init_module(void)
 
 static void __exit nvmet_fc_exit_module(void)
 {
+	/* ensure any shutdown operation, e.g. delete ctrls have finished */
+	flush_workqueue(nvmet_wq);
+
 	/* sanity check - all lports should be removed */
 	if (!list_empty(&nvmet_fc_target_list))
 		pr_warn("%s: targetport list not empty\n", __func__);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 09/16] nvmet-fc: free queue and assoc directly
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
                   ` (7 preceding siblings ...)
  2023-12-18 15:30 ` [PATCH v3 08/16] nvmet-fc: untangle cross refcounting objects Daniel Wagner
@ 2023-12-18 15:30 ` Daniel Wagner
  2023-12-19  7:59   ` Hannes Reinecke
  2023-12-18 15:30 ` [PATCH v3 10/16] nvmet-fc: hold reference on hostport match Daniel Wagner
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:30 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

Neither struct nvmet_fc_tgt_queue nor struct nvmet_fc_tgt_assoc are data
structure which are used in a RCU context. So there is no reason to
delay the free operation.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/fc.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index db992df13c73..0adf65154e27 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -145,7 +145,6 @@ struct nvmet_fc_tgt_queue {
 	struct list_head		avail_defer_list;
 	struct workqueue_struct		*work_q;
 	struct kref			ref;
-	struct rcu_head			rcu;
 	/* array of fcp_iods */
 	struct nvmet_fc_fcp_iod		fod[] __counted_by(sqsize);
 } __aligned(sizeof(unsigned long long));
@@ -170,7 +169,6 @@ struct nvmet_fc_tgt_assoc {
 	struct nvmet_fc_tgt_queue __rcu	*queues[NVMET_NR_QUEUES + 1];
 	struct kref			ref;
 	struct work_struct		del_work;
-	struct rcu_head			rcu;
 };
 
 
@@ -854,7 +852,7 @@ nvmet_fc_tgt_queue_free(struct kref *ref)
 
 	destroy_workqueue(queue->work_q);
 
-	kfree_rcu(queue, rcu);
+	kfree(queue);
 }
 
 static void
@@ -1187,8 +1185,8 @@ nvmet_fc_target_assoc_free(struct kref *ref)
 	dev_info(tgtport->dev,
 		"{%d:%d} Association freed\n",
 		tgtport->fc_target_port.port_num, assoc->a_id);
-	kfree_rcu(assoc, rcu);
 	nvmet_fc_tgtport_put(tgtport);
+	kfree(assoc);
 }
 
 static void
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 10/16] nvmet-fc: hold reference on hostport match
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
                   ` (8 preceding siblings ...)
  2023-12-18 15:30 ` [PATCH v3 09/16] nvmet-fc: free queue and assoc directly Daniel Wagner
@ 2023-12-18 15:30 ` Daniel Wagner
  2023-12-19  8:00   ` Hannes Reinecke
  2023-12-18 15:30 ` [PATCH v3 11/16] nvmet-fc: remove null hostport pointer check Daniel Wagner
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:30 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

The hostport data structure is shared between the association, this why
we keep track of the users via a refcount. So we should not decrement
the refcount on a match and free the hostport several times.

Reported by KASAN.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/fc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index 0adf65154e27..fa7a6d2edd88 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -1070,8 +1070,6 @@ nvmet_fc_alloc_hostport(struct nvmet_fc_tgtport *tgtport, void *hosthandle)
 		/* new allocation not needed */
 		kfree(newhost);
 		newhost = match;
-		/* no new allocation - release reference */
-		nvmet_fc_tgtport_put(tgtport);
 	} else {
 		newhost->tgtport = tgtport;
 		newhost->hosthandle = hosthandle;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 11/16] nvmet-fc: remove null hostport pointer check
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
                   ` (9 preceding siblings ...)
  2023-12-18 15:30 ` [PATCH v3 10/16] nvmet-fc: hold reference on hostport match Daniel Wagner
@ 2023-12-18 15:30 ` Daniel Wagner
  2023-12-19 11:37   ` Hannes Reinecke
  2023-12-18 15:31 ` [PATCH v3 12/16] nvmet-fc: do not tack refs on tgtports from assoc Daniel Wagner
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:30 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

An association has always a valid hostport pointer. Remove useless
null pointer check.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/fc.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index fa7a6d2edd88..c243085d6f42 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -489,8 +489,7 @@ nvmet_fc_xmt_disconnect_assoc(struct nvmet_fc_tgt_assoc *assoc)
 	 * message is normal. Otherwise, send unless the hostport has
 	 * already been invalidated by the lldd.
 	 */
-	if (!tgtport->ops->ls_req || !assoc->hostport ||
-	    assoc->hostport->invalid)
+	if (!tgtport->ops->ls_req || assoc->hostport->invalid)
 		return;
 
 	lsop = kzalloc((sizeof(*lsop) +
@@ -1530,8 +1529,7 @@ nvmet_fc_invalidate_host(struct nvmet_fc_target_port *target_port,
 	spin_lock_irqsave(&tgtport->lock, flags);
 	list_for_each_entry_safe(assoc, next,
 				&tgtport->assoc_list, a_list) {
-		if (!assoc->hostport ||
-		    assoc->hostport->hosthandle != hosthandle)
+		if (assoc->hostport->hosthandle != hosthandle)
 			continue;
 		if (!nvmet_fc_tgt_a_get(assoc))
 			continue;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 12/16] nvmet-fc: do not tack refs on tgtports from assoc
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
                   ` (10 preceding siblings ...)
  2023-12-18 15:30 ` [PATCH v3 11/16] nvmet-fc: remove null hostport pointer check Daniel Wagner
@ 2023-12-18 15:31 ` Daniel Wagner
  2023-12-19 11:38   ` Hannes Reinecke
  2023-12-18 15:31 ` [PATCH v3 13/16] nvmet-fc: abort command if when there is binding Daniel Wagner
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:31 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

The association life time is tight to the life time of the target port.
That means we do not take extra a refcount when creating a association.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/fc.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index c243085d6f42..47cecc8c72b2 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -1109,12 +1109,9 @@ nvmet_fc_alloc_target_assoc(struct nvmet_fc_tgtport *tgtport, void *hosthandle)
 	if (idx < 0)
 		goto out_free_assoc;
 
-	if (!nvmet_fc_tgtport_get(tgtport))
-		goto out_ida;
-
 	assoc->hostport = nvmet_fc_alloc_hostport(tgtport, hosthandle);
 	if (IS_ERR(assoc->hostport))
-		goto out_put;
+		goto out_ida;
 
 	assoc->tgtport = tgtport;
 	assoc->a_id = idx;
@@ -1144,8 +1141,6 @@ nvmet_fc_alloc_target_assoc(struct nvmet_fc_tgtport *tgtport, void *hosthandle)
 
 	return assoc;
 
-out_put:
-	nvmet_fc_tgtport_put(tgtport);
 out_ida:
 	ida_free(&tgtport->assoc_cnt, idx);
 out_free_assoc:
@@ -1182,7 +1177,6 @@ nvmet_fc_target_assoc_free(struct kref *ref)
 	dev_info(tgtport->dev,
 		"{%d:%d} Association freed\n",
 		tgtport->fc_target_port.port_num, assoc->a_id);
-	nvmet_fc_tgtport_put(tgtport);
 	kfree(assoc);
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 13/16] nvmet-fc: abort command if when there is binding
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
                   ` (11 preceding siblings ...)
  2023-12-18 15:31 ` [PATCH v3 12/16] nvmet-fc: do not tack refs on tgtports from assoc Daniel Wagner
@ 2023-12-18 15:31 ` Daniel Wagner
  2023-12-19 11:39   ` Hannes Reinecke
  2023-12-18 15:31 ` [PATCH v3 14/16] nvmet-fc: free hostport after release reference to tgtport Daniel Wagner
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:31 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

WHen the target port has not active port binding, there is no point in
trying to process the command as it has to fail anyway. Instead adding
checks to all commands abort the command early.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/fc.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index 47cecc8c72b2..663c51c9fe53 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -1101,6 +1101,9 @@ nvmet_fc_alloc_target_assoc(struct nvmet_fc_tgtport *tgtport, void *hosthandle)
 	int idx;
 	bool needrandom = true;
 
+	if (!tgtport->pe)
+		return NULL;
+
 	assoc = kzalloc(sizeof(*assoc), GFP_KERNEL);
 	if (!assoc)
 		return NULL;
@@ -2520,8 +2523,9 @@ nvmet_fc_handle_fcp_rqst(struct nvmet_fc_tgtport *tgtport,
 
 	fod->req.cmd = &fod->cmdiubuf.sqe;
 	fod->req.cqe = &fod->rspiubuf.cqe;
-	if (tgtport->pe)
-		fod->req.port = tgtport->pe->port;
+	if (!tgtport->pe)
+		goto transport_error;
+	fod->req.port = tgtport->pe->port;
 
 	/* clear any response payload */
 	memset(&fod->rspiubuf, 0, sizeof(fod->rspiubuf));
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 14/16] nvmet-fc: free hostport after release reference to tgtport
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
                   ` (12 preceding siblings ...)
  2023-12-18 15:31 ` [PATCH v3 13/16] nvmet-fc: abort command if when there is binding Daniel Wagner
@ 2023-12-18 15:31 ` Daniel Wagner
  2023-12-19 11:41   ` Hannes Reinecke
  2023-12-18 15:31 ` [PATCH v3 15/16] nvmet-fc: avoid deadlock on delete association path Daniel Wagner
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:31 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

Give the ref back before destroying the hostport object to prevent a
potential UAF.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/fc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index 663c51c9fe53..23d8779dc221 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -986,8 +986,8 @@ nvmet_fc_hostport_free(struct kref *ref)
 	spin_unlock_irqrestore(&tgtport->lock, flags);
 	if (tgtport->ops->host_release && hostport->invalid)
 		tgtport->ops->host_release(hostport->hosthandle);
-	kfree(hostport);
 	nvmet_fc_tgtport_put(tgtport);
+	kfree(hostport);
 }
 
 static void
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 15/16] nvmet-fc: avoid deadlock on delete association path
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
                   ` (13 preceding siblings ...)
  2023-12-18 15:31 ` [PATCH v3 14/16] nvmet-fc: free hostport after release reference to tgtport Daniel Wagner
@ 2023-12-18 15:31 ` Daniel Wagner
  2023-12-19  6:00   ` kernel test robot
                     ` (3 more replies)
  2023-12-18 15:31 ` [PATCH v3 16/16] nvmet-fc: take ref count on tgtport before delete assoc Daniel Wagner
                   ` (3 subsequent siblings)
  18 siblings, 4 replies; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:31 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

When deleting an association the shutdown path is deadlocking because we
try to flush the nvmet_wq nested. Avoid this by deadlock by deferring
the put work into its own work item.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/fc.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index 23d8779dc221..30ba4ede333f 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -111,6 +111,8 @@ struct nvmet_fc_tgtport {
 	struct nvmet_fc_port_entry	*pe;
 	struct kref			ref;
 	u32				max_sg_cnt;
+
+	struct work_struct		put_work;
 };
 
 struct nvmet_fc_port_entry {
@@ -243,6 +245,13 @@ static LIST_HEAD(nvmet_fc_portentry_list);
 
 static void nvmet_fc_handle_ls_rqst_work(struct work_struct *work);
 static void nvmet_fc_fcp_rqst_op_defer_work(struct work_struct *work);
+static void nvmet_fc_put_tgtport_work(struct work_struct *work)
+{
+	struct nvmet_fc_tgtport *tgtport =
+		container_of(work, struct nvmet_fc_tgtport, put_work);
+
+	nvmet_fc_tgtport_put(tgtport);
+}
 static void nvmet_fc_tgt_a_put(struct nvmet_fc_tgt_assoc *assoc);
 static int nvmet_fc_tgt_a_get(struct nvmet_fc_tgt_assoc *assoc);
 static void nvmet_fc_tgt_q_put(struct nvmet_fc_tgt_queue *queue);
@@ -359,7 +368,7 @@ __nvmet_fc_finish_ls_req(struct nvmet_fc_ls_req_op *lsop)
 
 	if (!lsop->req_queued) {
 		spin_unlock_irqrestore(&tgtport->lock, flags);
-		nvmet_fc_tgtport_put(tgtport);
+		queue_work(nvmet_wq, &tgtport->put_work);
 		return;
 	}
 
@@ -373,7 +382,7 @@ __nvmet_fc_finish_ls_req(struct nvmet_fc_ls_req_op *lsop)
 				  (lsreq->rqstlen + lsreq->rsplen),
 				  DMA_BIDIRECTIONAL);
 
-	nvmet_fc_tgtport_put(tgtport);
+	queue_work(nvmet_wq, &tgtport->put_work);
 }
 
 static int
@@ -1402,6 +1411,7 @@ nvmet_fc_register_targetport(struct nvmet_fc_port_info *pinfo,
 	kref_init(&newrec->ref);
 	ida_init(&newrec->assoc_cnt);
 	newrec->max_sg_cnt = template->max_sgl_segments;
+	INIT_WORK(&newrec->put_work, nvmet_fc_put_tgtport_work);
 
 	ret = nvmet_fc_alloc_ls_iodlist(newrec);
 	if (ret) {
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 16/16] nvmet-fc: take ref count on tgtport before delete assoc
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
                   ` (14 preceding siblings ...)
  2023-12-18 15:31 ` [PATCH v3 15/16] nvmet-fc: avoid deadlock on delete association path Daniel Wagner
@ 2023-12-18 15:31 ` Daniel Wagner
  2023-12-19 11:47   ` Hannes Reinecke
  2023-12-18 15:31 ` [PATCH v3 17/17] nvmet-fc: add extensive debug logging Daniel Wagner
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:31 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

We have to ensure that the tgtport is not going away
before be have remove all the associations.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/fc.c | 31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index 30ba4ede333f..455d35ef97eb 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -1092,13 +1092,28 @@ nvmet_fc_alloc_hostport(struct nvmet_fc_tgtport *tgtport, void *hosthandle)
 }
 
 static void
-nvmet_fc_delete_assoc(struct work_struct *work)
+nvmet_fc_delete_assoc(struct nvmet_fc_tgt_assoc *assoc)
+{
+	nvmet_fc_delete_target_assoc(assoc);
+	nvmet_fc_tgt_a_put(assoc);
+}
+
+static void
+nvmet_fc_delete_assoc_work(struct work_struct *work)
 {
 	struct nvmet_fc_tgt_assoc *assoc =
 		container_of(work, struct nvmet_fc_tgt_assoc, del_work);
+	struct nvmet_fc_tgtport *tgtport = assoc->tgtport;
 
-	nvmet_fc_delete_target_assoc(assoc);
-	nvmet_fc_tgt_a_put(assoc);
+	nvmet_fc_delete_assoc(assoc);
+	nvmet_fc_tgtport_put(tgtport);
+}
+
+static void
+nvmet_fc_schedule_delete_assoc(struct nvmet_fc_tgt_assoc *assoc)
+{
+	nvmet_fc_tgtport_get(assoc->tgtport);
+	queue_work(nvmet_wq, &assoc->del_work);
 }
 
 static struct nvmet_fc_tgt_assoc *
@@ -1129,7 +1144,7 @@ nvmet_fc_alloc_target_assoc(struct nvmet_fc_tgtport *tgtport, void *hosthandle)
 	assoc->a_id = idx;
 	INIT_LIST_HEAD(&assoc->a_list);
 	kref_init(&assoc->ref);
-	INIT_WORK(&assoc->del_work, nvmet_fc_delete_assoc);
+	INIT_WORK(&assoc->del_work, nvmet_fc_delete_assoc_work);
 	atomic_set(&assoc->terminating, 0);
 
 	while (needrandom) {
@@ -1489,7 +1504,7 @@ __nvmet_fc_free_assocs(struct nvmet_fc_tgtport *tgtport)
 	list_for_each_entry_rcu(assoc, &tgtport->assoc_list, a_list) {
 		if (!nvmet_fc_tgt_a_get(assoc))
 			continue;
-		queue_work(nvmet_wq, &assoc->del_work);
+		nvmet_fc_schedule_delete_assoc(assoc);
 		nvmet_fc_tgt_a_put(assoc);
 	}
 	rcu_read_unlock();
@@ -1542,7 +1557,7 @@ nvmet_fc_invalidate_host(struct nvmet_fc_target_port *target_port,
 			continue;
 		assoc->hostport->invalid = 1;
 		noassoc = false;
-		queue_work(nvmet_wq, &assoc->del_work);
+		nvmet_fc_schedule_delete_assoc(assoc);
 		nvmet_fc_tgt_a_put(assoc);
 	}
 	spin_unlock_irqrestore(&tgtport->lock, flags);
@@ -1587,7 +1602,7 @@ nvmet_fc_delete_ctrl(struct nvmet_ctrl *ctrl)
 		nvmet_fc_tgtport_put(tgtport);
 
 		if (found_ctrl) {
-			queue_work(nvmet_wq, &assoc->del_work);
+			nvmet_fc_schedule_delete_assoc(assoc);
 			nvmet_fc_tgt_a_put(assoc);
 			return;
 		}
@@ -1894,7 +1909,7 @@ nvmet_fc_ls_disconnect(struct nvmet_fc_tgtport *tgtport,
 		nvmet_fc_xmt_ls_rsp(tgtport, oldls);
 	}
 
-	queue_work(nvmet_wq, &assoc->del_work);
+	nvmet_fc_schedule_delete_assoc(assoc);
 	nvmet_fc_tgt_a_put(assoc);
 
 	return false;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 17/17] nvmet-fc: add extensive debug logging
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
                   ` (15 preceding siblings ...)
  2023-12-18 15:31 ` [PATCH v3 16/16] nvmet-fc: take ref count on tgtport before delete assoc Daniel Wagner
@ 2023-12-18 15:31 ` Daniel Wagner
  2023-12-19 11:54   ` Hannes Reinecke
  2023-12-18 16:10 ` [PATCH v3 00/17] enable nvmet-fc for blktests Maurizio Lombardi
  2024-01-02 21:36 ` Keith Busch
  18 siblings, 1 reply; 52+ messages in thread
From: Daniel Wagner @ 2023-12-18 15:31 UTC (permalink / raw)
  To: linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke, Daniel Wagner

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/target/configfs.c |   4 +
 drivers/nvme/target/core.c     |  13 ++++
 drivers/nvme/target/fc.c       | 132 +++++++++++++++++++++++++++++----
 3 files changed, 135 insertions(+), 14 deletions(-)

diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
index e307a044b1a1..ea05e8c62d4b 100644
--- a/drivers/nvme/target/configfs.c
+++ b/drivers/nvme/target/configfs.c
@@ -965,6 +965,7 @@ static int nvmet_port_subsys_allow_link(struct config_item *parent,
 			goto out_free_link;
 	}
 
+	pr_info("%s: %s\n", __func__, subsys->subsysnqn);
 	if (list_empty(&port->subsystems)) {
 		ret = nvmet_enable_port(port);
 		if (ret)
@@ -1050,6 +1051,7 @@ static int nvmet_allowed_hosts_allow_link(struct config_item *parent,
 		if (!strcmp(nvmet_host_name(p->host), nvmet_host_name(host)))
 			goto out_free_link;
 	}
+	pr_info("%s: adding hostnqn %s\n", __func__, nvmet_host_name(host));
 	list_add_tail(&link->entry, &subsys->hosts);
 	nvmet_subsys_disc_changed(subsys, host);
 
@@ -1879,6 +1881,8 @@ static struct config_group *nvmet_ports_make(struct config_group *group,
 	u16 portid;
 	u32 i;
 
+	pr_info("%s\n", __func__);
+
 	if (kstrtou16(name, 0, &portid))
 		return ERR_PTR(-EINVAL);
 
diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index 3935165048e7..4d5a9e4fcc9d 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -308,8 +308,11 @@ void nvmet_port_del_ctrls(struct nvmet_port *port, struct nvmet_subsys *subsys)
 {
 	struct nvmet_ctrl *ctrl;
 
+	pr_info("%s: subsys %s port %p\n", __func__, subsys->subsysnqn, port);
+
 	mutex_lock(&subsys->lock);
 	list_for_each_entry(ctrl, &subsys->ctrls, subsys_entry) {
+		pr_info("%s: ctrl %p ctrl->port %p\n", __func__, ctrl, ctrl->port);
 		if (ctrl->port == port)
 			ctrl->ops->delete_ctrl(ctrl);
 	}
@@ -1458,6 +1461,8 @@ u16 nvmet_alloc_ctrl(const char *subsysnqn, const char *hostnqn,
 	mutex_unlock(&subsys->lock);
 
 	*ctrlp = ctrl;
+
+	pr_info("%s: ctrl %p, subsysnqn %s hostnqn %s\n", __func__, ctrl, subsysnqn, hostnqn);
 	return 0;
 
 out_free_sqs:
@@ -1477,6 +1482,8 @@ static void nvmet_ctrl_free(struct kref *ref)
 	struct nvmet_ctrl *ctrl = container_of(ref, struct nvmet_ctrl, ref);
 	struct nvmet_subsys *subsys = ctrl->subsys;
 
+	pr_info("%s: ctrl %p %s\n", __func__, ctrl, ctrl->subsysnqn);
+
 	mutex_lock(&subsys->lock);
 	nvmet_release_p2p_ns_map(ctrl);
 	list_del(&ctrl->subsys_entry);
@@ -1550,6 +1557,8 @@ struct nvmet_subsys *nvmet_subsys_alloc(const char *subsysnqn,
 	char serial[NVMET_SN_MAX_SIZE / 2];
 	int ret;
 
+	pr_info("%s\n", __func__);
+
 	subsys = kzalloc(sizeof(*subsys), GFP_KERNEL);
 	if (!subsys)
 		return ERR_PTR(-ENOMEM);
@@ -1620,6 +1629,8 @@ static void nvmet_subsys_free(struct kref *ref)
 
 	WARN_ON_ONCE(!xa_empty(&subsys->namespaces));
 
+	pr_info("%s\n", __func__);
+
 	xa_destroy(&subsys->namespaces);
 	nvmet_passthru_subsys_free(subsys);
 
@@ -1633,6 +1644,8 @@ void nvmet_subsys_del_ctrls(struct nvmet_subsys *subsys)
 {
 	struct nvmet_ctrl *ctrl;
 
+	pr_info("%s\n", __func__);
+
 	mutex_lock(&subsys->lock);
 	list_for_each_entry(ctrl, &subsys->ctrls, subsys_entry)
 		ctrl->ops->delete_ctrl(ctrl);
diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index 455d35ef97eb..d50ff29697fc 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -242,6 +242,31 @@ static LIST_HEAD(nvmet_fc_target_list);
 static DEFINE_IDA(nvmet_fc_tgtport_cnt);
 static LIST_HEAD(nvmet_fc_portentry_list);
 
+static void __nvmet_fc_tgtport_put(struct nvmet_fc_tgtport *tgtport);
+static int __nvmet_fc_tgtport_get(struct nvmet_fc_tgtport *tgtport);
+
+#if 1
+#define nvmet_fc_tgtport_put(p)						\
+({									\
+	pr_info("nvmet_fc_tgtport_put: %p %d %s:%d\n",	\
+		p, atomic_read(&p->ref.refcount.refs),			\
+		__func__, __LINE__);					\
+	__nvmet_fc_tgtport_put(p);					\
+})
+
+#define nvmet_fc_tgtport_get(p)						\
+({									\
+	int ___r = __nvmet_fc_tgtport_get(p);				\
+									\
+	pr_info("nvmet_fc_tgtport_get: %p %d %s:%d\n",			\
+		p, atomic_read(&p->ref.refcount.refs),			\
+		__func__, __LINE__);					\
+	___r;								\
+})
+#else
+#define nvmet_fc_tgtport_put(p) __nvmet_fc_tgtport_put(p)
+#define nvmet_fc_tgtport_get(p) __nvmet_fc_tgtport_get(p)
+#endif
 
 static void nvmet_fc_handle_ls_rqst_work(struct work_struct *work);
 static void nvmet_fc_fcp_rqst_op_defer_work(struct work_struct *work);
@@ -252,12 +277,84 @@ static void nvmet_fc_put_tgtport_work(struct work_struct *work)
 
 	nvmet_fc_tgtport_put(tgtport);
 }
-static void nvmet_fc_tgt_a_put(struct nvmet_fc_tgt_assoc *assoc);
-static int nvmet_fc_tgt_a_get(struct nvmet_fc_tgt_assoc *assoc);
-static void nvmet_fc_tgt_q_put(struct nvmet_fc_tgt_queue *queue);
-static int nvmet_fc_tgt_q_get(struct nvmet_fc_tgt_queue *queue);
-static void nvmet_fc_tgtport_put(struct nvmet_fc_tgtport *tgtport);
-static int nvmet_fc_tgtport_get(struct nvmet_fc_tgtport *tgtport);
+static void __nvmet_fc_tgt_a_put(struct nvmet_fc_tgt_assoc *assoc);
+static int __nvmet_fc_tgt_a_get(struct nvmet_fc_tgt_assoc *assoc);
+
+#if 1
+#define nvmet_fc_tgt_a_put(a)						\
+({									\
+	pr_info("nvmet_fc_tgt_a_put: %d %d %s:%d \n",			\
+		a->a_id, atomic_read(&a->ref.refcount.refs),		\
+		__func__, __LINE__);					\
+	__nvmet_fc_tgt_a_put(a);					\
+})
+
+#define nvmet_fc_tgt_a_get(a)						\
+({									\
+	int ___r = __nvmet_fc_tgt_a_get(a);				\
+									\
+	pr_info("nvmet_fc_tgt_a_get: %d %d %s:%d\n",			\
+		a->a_id, atomic_read(&a->ref.refcount.refs),		\
+		__func__, __LINE__);					\
+	___r;								\
+})
+#else
+#define nvmet_fc_tgt_a_put(a) __nvmet_fc_tgt_a_put(a)
+#define nvmet_fc_tgt_a_get(a) __nvmet_fc_tgt_a_get(a)
+#endif
+
+static void __nvmet_fc_hostport_put(struct nvmet_fc_hostport *hostport);
+static int __nvmet_fc_hostport_get(struct nvmet_fc_hostport *hostport);
+
+#if 0
+#define nvmet_fc_hostport_put(p)					\
+({									\
+	pr_info("nvmet_fc_hostport_put: %p %d %s:%d\n",			\
+		p, atomic_read(&p->ref.refcount.refs),			\
+		__func__, __LINE__);					\
+	__nvmet_fc_hostport_put(p);					\
+})
+
+#define nvmet_fc_hostport_get(p)					\
+({									\
+	int ___r = __nvmet_fc_hostport_get(p);				\
+									\
+	pr_info("nvmet_fc_hostport_get: %p %d %s:%d\n",			\
+		p, atomic_read(&p->ref.refcount.refs),			\
+		__func__, __LINE__);					\
+	___r;								\
+})
+#else
+#define nvmet_fc_hostport_put(p) __nvmet_fc_hostport_put(p)
+#define nvmet_fc_hostport_get(p) __nvmet_fc_hostport_get(p)
+#endif
+
+static void __nvmet_fc_tgt_q_put(struct nvmet_fc_tgt_queue *queue);
+static int __nvmet_fc_tgt_q_get(struct nvmet_fc_tgt_queue *queue);
+
+#if 0
+#define nvmet_fc_tgt_q_put(q)						\
+({									\
+	pr_info("nvmet_fc_tgt_q_put: %p %d %s:%d\n",			\
+		q, atomic_read(&q->ref.refcount.refs),			\
+		__func__, __LINE__);					\
+	__nvmet_fc_tgt_q_put(q);					\
+})
+
+#define nvmet_fc_tgt_q_get(q)						\
+({									\
+	int ___r = __nvmet_fc_tgt_q_get(q);				\
+									\
+	pr_info("nvmet_fc_tgt_q_get: %p %d %s:%d\n",			\
+		q, atomic_read(&q->ref.refcount.refs),			\
+		__func__, __LINE__);					\
+	___r;								\
+})
+#else
+#define nvmet_fc_tgt_q_put(q) __nvmet_fc_tgt_q_put(q)
+#define nvmet_fc_tgt_q_get(q) __nvmet_fc_tgt_q_get(q)
+#endif
+
 static void nvmet_fc_handle_fcp_rqst(struct nvmet_fc_tgtport *tgtport,
 					struct nvmet_fc_fcp_iod *fod);
 static void nvmet_fc_delete_target_assoc(struct nvmet_fc_tgt_assoc *assoc);
@@ -864,13 +961,13 @@ nvmet_fc_tgt_queue_free(struct kref *ref)
 }
 
 static void
-nvmet_fc_tgt_q_put(struct nvmet_fc_tgt_queue *queue)
+__nvmet_fc_tgt_q_put(struct nvmet_fc_tgt_queue *queue)
 {
 	kref_put(&queue->ref, nvmet_fc_tgt_queue_free);
 }
 
 static int
-nvmet_fc_tgt_q_get(struct nvmet_fc_tgt_queue *queue)
+__nvmet_fc_tgt_q_get(struct nvmet_fc_tgt_queue *queue)
 {
 	return kref_get_unless_zero(&queue->ref);
 }
@@ -1000,13 +1097,13 @@ nvmet_fc_hostport_free(struct kref *ref)
 }
 
 static void
-nvmet_fc_hostport_put(struct nvmet_fc_hostport *hostport)
+__nvmet_fc_hostport_put(struct nvmet_fc_hostport *hostport)
 {
 	kref_put(&hostport->ref, nvmet_fc_hostport_free);
 }
 
 static int
-nvmet_fc_hostport_get(struct nvmet_fc_hostport *hostport)
+__nvmet_fc_hostport_get(struct nvmet_fc_hostport *hostport)
 {
 	return kref_get_unless_zero(&hostport->ref);
 }
@@ -1208,13 +1305,13 @@ nvmet_fc_target_assoc_free(struct kref *ref)
 }
 
 static void
-nvmet_fc_tgt_a_put(struct nvmet_fc_tgt_assoc *assoc)
+__nvmet_fc_tgt_a_put(struct nvmet_fc_tgt_assoc *assoc)
 {
 	kref_put(&assoc->ref, nvmet_fc_target_assoc_free);
 }
 
 static int
-nvmet_fc_tgt_a_get(struct nvmet_fc_tgt_assoc *assoc)
+__nvmet_fc_tgt_a_get(struct nvmet_fc_tgt_assoc *assoc)
 {
 	return kref_get_unless_zero(&assoc->ref);
 }
@@ -1441,6 +1538,7 @@ nvmet_fc_register_targetport(struct nvmet_fc_port_info *pinfo,
 	spin_unlock_irqrestore(&nvmet_fc_tgtlock, flags);
 
 	*portptr = &newrec->fc_target_port;
+	pr_info("%s: targetport %p\n", __func__, newrec);
 	return 0;
 
 out_free_newrec:
@@ -1484,13 +1582,13 @@ nvmet_fc_free_tgtport(struct kref *ref)
 }
 
 static void
-nvmet_fc_tgtport_put(struct nvmet_fc_tgtport *tgtport)
+__nvmet_fc_tgtport_put(struct nvmet_fc_tgtport *tgtport)
 {
 	kref_put(&tgtport->ref, nvmet_fc_free_tgtport);
 }
 
 static int
-nvmet_fc_tgtport_get(struct nvmet_fc_tgtport *tgtport)
+__nvmet_fc_tgtport_get(struct nvmet_fc_tgtport *tgtport)
 {
 	return kref_get_unless_zero(&tgtport->ref);
 }
@@ -1580,6 +1678,8 @@ nvmet_fc_delete_ctrl(struct nvmet_ctrl *ctrl)
 	unsigned long flags;
 	bool found_ctrl = false;
 
+	pr_info("%s: ctrl %p\n", __func__, ctrl);
+
 	/* this is a bit ugly, but don't want to make locks layered */
 	spin_lock_irqsave(&nvmet_fc_tgtlock, flags);
 	list_for_each_entry_safe(tgtport, next, &nvmet_fc_target_list,
@@ -1591,6 +1691,8 @@ nvmet_fc_delete_ctrl(struct nvmet_ctrl *ctrl)
 		rcu_read_lock();
 		list_for_each_entry_rcu(assoc, &tgtport->assoc_list, a_list) {
 			queue = rcu_dereference(assoc->queues[0]);
+			pr_info("%s: queue %p nvme_sq.ctrl %p\n",
+				__func__, queue, queue ? queue->nvme_sq.ctrl : NULL);
 			if (queue && queue->nvme_sq.ctrl == ctrl) {
 				if (nvmet_fc_tgt_a_get(assoc))
 					found_ctrl = true;
@@ -1628,6 +1730,8 @@ nvmet_fc_unregister_targetport(struct nvmet_fc_target_port *target_port)
 {
 	struct nvmet_fc_tgtport *tgtport = targetport_to_tgtport(target_port);
 
+	pr_info("%s\n", __func__);
+
 	nvmet_fc_portentry_unbind_tgt(tgtport);
 
 	/* terminate any outstanding associations */
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 00/17] enable nvmet-fc for blktests
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
                   ` (16 preceding siblings ...)
  2023-12-18 15:31 ` [PATCH v3 17/17] nvmet-fc: add extensive debug logging Daniel Wagner
@ 2023-12-18 16:10 ` Maurizio Lombardi
  2024-01-02 21:36 ` Keith Busch
  18 siblings, 0 replies; 52+ messages in thread
From: Maurizio Lombardi @ 2023-12-18 16:10 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-nvme, linux-kernel, Christoph Hellwig, Sagi Grimberg,
	Keith Busch, James Smart, Hannes Reinecke

po 18. 12. 2023 v 16:34 odesílatel Daniel Wagner <dwagner@suse.de> napsal:
>
> Apropos KASAN, it still reports the problem from [1], so anyone who want to run
> this series needs to revert ee6fdc5055e9 ("nvme-fc: fix race between error
> recovery and creating association").

We hit this regression in RHEL too and we were forced to revert that
commit, it's obviously buggy
because it calls blocking functions with interrupts disabled.
Please revert it.

Note: I have tried to fix it and close the race condition but it all
became a bit too complex, also,
I didn't have the opportunity to test it yet.
http://bsdbackstore.eu/misc/0001-nvme-fc-fix-races-and-scheduling-while-atomic-bugs.patch

Maurizio


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 01/16] nvmet: report ioccsz and iorcsz for disc ctrl
  2023-12-18 15:30 ` [PATCH v3 01/16] nvmet: report ioccsz and iorcsz for disc ctrl Daniel Wagner
@ 2023-12-19  4:27   ` Christoph Hellwig
  2023-12-20  0:50     ` Max Gurtovoy
  2023-12-19  7:24   ` Hannes Reinecke
  1 sibling, 1 reply; 52+ messages in thread
From: Christoph Hellwig @ 2023-12-19  4:27 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-nvme, linux-kernel, Christoph Hellwig, Sagi Grimberg,
	Keith Busch, James Smart, Hannes Reinecke

On Mon, Dec 18, 2023 at 04:30:49PM +0100, Daniel Wagner wrote:
> +	/*
> +	 * Max command capsule size is sqe + in-capsule data size.
> +	 * Disable in-capsule data for Metadata capable controllers.

A why on the disable would be useful here - the fact that it is disabled
is pretty obvious from the code.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 02/16] nvmet-fc: remove unnecessary bracket
  2023-12-18 15:30 ` [PATCH v3 02/16] nvmet-fc: remove unnecessary bracket Daniel Wagner
@ 2023-12-19  4:27   ` Christoph Hellwig
  2023-12-19  7:25   ` Hannes Reinecke
  1 sibling, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2023-12-19  4:27 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-nvme, linux-kernel, Christoph Hellwig, Sagi Grimberg,
	Keith Busch, James Smart, Hannes Reinecke

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 03/16] nvmet-trace: avoid dereferencing pointer too early
  2023-12-18 15:30 ` [PATCH v3 03/16] nvmet-trace: avoid dereferencing pointer too early Daniel Wagner
@ 2023-12-19  4:29   ` Christoph Hellwig
  0 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2023-12-19  4:29 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-nvme, linux-kernel, Christoph Hellwig, Sagi Grimberg,
	Keith Busch, James Smart, Hannes Reinecke

On Mon, Dec 18, 2023 at 04:30:51PM +0100, Daniel Wagner wrote:
>  #ifndef TRACE_HEADER_MULTI_READ
> -static inline struct nvmet_ctrl *nvmet_req_to_ctrl(struct nvmet_req *req)
> +static inline u16 nvmet_req_to_ctrl_id(struct nvmet_req *req)
>  {
> -	return req->sq->ctrl;
> +	/*
> +	 * The queue and controller pointer are not valid until an association
> +	 * has been established.

s/pointer/pointers/ ?

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 04/16] nvmet-trace: null terminate device name string correctly
  2023-12-18 15:30 ` [PATCH v3 04/16] nvmet-trace: null terminate device name string correctly Daniel Wagner
@ 2023-12-19  4:29   ` Christoph Hellwig
  0 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2023-12-19  4:29 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-nvme, linux-kernel, Christoph Hellwig, Sagi Grimberg,
	Keith Busch, James Smart, Hannes Reinecke

On Mon, Dec 18, 2023 at 04:30:52PM +0100, Daniel Wagner wrote:
> strlen returns the string length excluding the null byte ('\0'), thus we
> cut the last chars from the device name. While at it, switch snprintf to
> ensure we always have properly terminated string.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 05/16] nvmet-fcloop: Remove remote port from list when unlinking
  2023-12-18 15:30 ` [PATCH v3 05/16] nvmet-fcloop: Remove remote port from list when unlinking Daniel Wagner
@ 2023-12-19  4:31   ` Christoph Hellwig
  2023-12-19  7:26   ` Hannes Reinecke
  1 sibling, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2023-12-19  4:31 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-nvme, linux-kernel, Christoph Hellwig, Sagi Grimberg,
	Keith Busch, James Smart, Hannes Reinecke

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 06/16] nvme-fc: Do not wait in vain when unloading module
  2023-12-18 15:30 ` [PATCH v3 06/16] nvme-fc: Do not wait in vain when unloading module Daniel Wagner
@ 2023-12-19  4:35   ` Christoph Hellwig
  2024-01-05  5:05     ` Keith Busch
  2023-12-19  7:28   ` Hannes Reinecke
  1 sibling, 1 reply; 52+ messages in thread
From: Christoph Hellwig @ 2023-12-19  4:35 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-nvme, linux-kernel, Christoph Hellwig, Sagi Grimberg,
	Keith Busch, James Smart, Hannes Reinecke

On Mon, Dec 18, 2023 at 04:30:54PM +0100, Daniel Wagner wrote:
> The module unload code will wait for a controller to be delete even when
> there is no controller and we wait for completion forever to happen.
> Thus only wait for the completion when there is a controller which
> needs to be removed.

This whole code looks fishy to me, and I suspect this patch only papers
over it.  Why do we this wait to start with?  If we've found that out and
documented it, the code really should be using a wait_event variant that
checks for the actual condition (no more controllers), because without
that you might still have a race otherwise.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 07/16] nvmet-fc: Release reference on target port
  2023-12-18 15:30 ` [PATCH v3 07/16] nvmet-fc: Release reference on target port Daniel Wagner
@ 2023-12-19  4:36   ` Christoph Hellwig
  2023-12-19  7:28   ` Hannes Reinecke
  1 sibling, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2023-12-19  4:36 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-nvme, linux-kernel, Christoph Hellwig, Sagi Grimberg,
	Keith Busch, James Smart, Hannes Reinecke

On Mon, Dec 18, 2023 at 04:30:55PM +0100, Daniel Wagner wrote:
> @@ -360,6 +360,7 @@ __nvmet_fc_finish_ls_req(struct nvmet_fc_ls_req_op *lsop)
>  
>  	if (!lsop->req_queued) {
>  		spin_unlock_irqrestore(&tgtport->lock, flags);
> +		nvmet_fc_tgtport_put(tgtport);
>  		return;

Adding an out_put label and jumping to it would be nice here to keep
the cleanup path common.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 08/16] nvmet-fc: untangle cross refcounting objects
  2023-12-18 15:30 ` [PATCH v3 08/16] nvmet-fc: untangle cross refcounting objects Daniel Wagner
@ 2023-12-19  5:16   ` Christoph Hellwig
  2023-12-21  9:15     ` Daniel Wagner
  2023-12-19  7:59   ` Hannes Reinecke
  1 sibling, 1 reply; 52+ messages in thread
From: Christoph Hellwig @ 2023-12-19  5:16 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-nvme, linux-kernel, Christoph Hellwig, Sagi Grimberg,
	Keith Busch, James Smart, Hannes Reinecke

On Mon, Dec 18, 2023 at 04:30:56PM +0100, Daniel Wagner wrote:
> The live time of the queues are strictly bound to the lifetime of an

> +	struct nvmet_fc_tgt_queue	*_queues[NVMET_NR_QUEUES + 1];
>  	struct nvmet_fc_tgt_queue __rcu	*queues[NVMET_NR_QUEUES + 1];

For magic prefixes we use __, not _ in Linux.  But having two arrays
of queues right next to each other, once with rcu annotation and one
not rings a bit far warning bell to me.  Why do we have both?  When
are we supposed to use either?  Why is FC different from rest?

I really don't have any good answers as I don't know the code in the
FC transport very well, but I think this needs more work.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 15/16] nvmet-fc: avoid deadlock on delete association path
  2023-12-18 15:31 ` [PATCH v3 15/16] nvmet-fc: avoid deadlock on delete association path Daniel Wagner
@ 2023-12-19  6:00   ` kernel test robot
  2023-12-19  8:29   ` kernel test robot
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 52+ messages in thread
From: kernel test robot @ 2023-12-19  6:00 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: llvm, oe-kbuild-all, linux-kernel, Christoph Hellwig,
	Sagi Grimberg, Keith Busch, James Smart, Hannes Reinecke,
	Daniel Wagner

Hi Daniel,

kernel test robot noticed the following build errors:

[auto build test ERROR on linux-nvme/nvme-6.8]
[also build test ERROR on linus/master v6.7-rc6 next-20231218]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Daniel-Wagner/nvmet-fc-remove-unnecessary-bracket/20231219-003117
base:   git://git.infradead.org/nvme.git nvme-6.8
patch link:    https://lore.kernel.org/r/20231218153105.12717-16-dwagner%40suse.de
patch subject: [PATCH v3 15/16] nvmet-fc: avoid deadlock on delete association path
config: x86_64-rhel-8.3-rust (https://download.01.org/0day-ci/archive/20231219/202312191347.7X0VJgnV-lkp@intel.com/config)
compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231219/202312191347.7X0VJgnV-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202312191347.7X0VJgnV-lkp@intel.com/

All errors (new ones prefixed by >>):

>> drivers/nvme/target/fc.c:253:2: error: call to undeclared function 'nvmet_fc_tgtport_put'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
           nvmet_fc_tgtport_put(tgtport);
           ^
>> drivers/nvme/target/fc.c:259:13: error: static declaration of 'nvmet_fc_tgtport_put' follows non-static declaration
   static void nvmet_fc_tgtport_put(struct nvmet_fc_tgtport *tgtport);
               ^
   drivers/nvme/target/fc.c:253:2: note: previous implicit declaration is here
           nvmet_fc_tgtport_put(tgtport);
           ^
   2 errors generated.


vim +/nvmet_fc_tgtport_put +253 drivers/nvme/target/fc.c

   244	
   245	
   246	static void nvmet_fc_handle_ls_rqst_work(struct work_struct *work);
   247	static void nvmet_fc_fcp_rqst_op_defer_work(struct work_struct *work);
   248	static void nvmet_fc_put_tgtport_work(struct work_struct *work)
   249	{
   250		struct nvmet_fc_tgtport *tgtport =
   251			container_of(work, struct nvmet_fc_tgtport, put_work);
   252	
 > 253		nvmet_fc_tgtport_put(tgtport);
   254	}
   255	static void nvmet_fc_tgt_a_put(struct nvmet_fc_tgt_assoc *assoc);
   256	static int nvmet_fc_tgt_a_get(struct nvmet_fc_tgt_assoc *assoc);
   257	static void nvmet_fc_tgt_q_put(struct nvmet_fc_tgt_queue *queue);
   258	static int nvmet_fc_tgt_q_get(struct nvmet_fc_tgt_queue *queue);
 > 259	static void nvmet_fc_tgtport_put(struct nvmet_fc_tgtport *tgtport);
   260	static int nvmet_fc_tgtport_get(struct nvmet_fc_tgtport *tgtport);
   261	static void nvmet_fc_handle_fcp_rqst(struct nvmet_fc_tgtport *tgtport,
   262						struct nvmet_fc_fcp_iod *fod);
   263	static void nvmet_fc_delete_target_assoc(struct nvmet_fc_tgt_assoc *assoc);
   264	static void nvmet_fc_xmt_ls_rsp(struct nvmet_fc_tgtport *tgtport,
   265					struct nvmet_fc_ls_iod *iod);
   266	
   267	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 01/16] nvmet: report ioccsz and iorcsz for disc ctrl
  2023-12-18 15:30 ` [PATCH v3 01/16] nvmet: report ioccsz and iorcsz for disc ctrl Daniel Wagner
  2023-12-19  4:27   ` Christoph Hellwig
@ 2023-12-19  7:24   ` Hannes Reinecke
  2023-12-19 15:15     ` Daniel Wagner
  1 sibling, 1 reply; 52+ messages in thread
From: Hannes Reinecke @ 2023-12-19  7:24 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch, James Smart

On 12/18/23 16:30, Daniel Wagner wrote:
> The host started to verify the ioccsz and iorcsz values. I/O controllers
> return valid values but not the discovery controllers. Use the same
> values as for I/O controllers.
> 
> Fixes: 2fcd3ab39826 ("nvme-fabrics: check ioccsz and iorcsz")
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/target/discovery.c | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
> 
> diff --git a/drivers/nvme/target/discovery.c b/drivers/nvme/target/discovery.c
> index 668d257fa986..e3c4d247dd23 100644
> --- a/drivers/nvme/target/discovery.c
> +++ b/drivers/nvme/target/discovery.c
> @@ -249,6 +249,7 @@ static void nvmet_execute_disc_identify(struct nvmet_req *req)
>   {
>   	struct nvmet_ctrl *ctrl = req->sq->ctrl;
>   	struct nvme_id_ctrl *id;
> +	u32 cmd_capsule_size;
>   	u16 status = 0;
>   
>   	if (!nvmet_check_transfer_len(req, NVME_IDENTIFY_DATA_SIZE))
> @@ -294,6 +295,18 @@ static void nvmet_execute_disc_identify(struct nvmet_req *req)
>   
>   	strscpy(id->subnqn, ctrl->subsys->subsysnqn, sizeof(id->subnqn));
>   
> +	/*
> +	 * Max command capsule size is sqe + in-capsule data size.

'is sqe size + in-capsule data size'

> +	 * Disable in-capsule data for Metadata capable controllers.
> +	 */
> +	cmd_capsule_size = sizeof(struct nvme_command);
> +	if (!ctrl->pi_support)
> +		cmd_capsule_size += req->port->inline_data_size;
> +	id->ioccsz = cpu_to_le32(cmd_capsule_size / 16);

Why the division by 16?

> +
> +	/* Max response capsule size is cqe */
'is cqe size'

> +	id->iorcsz = cpu_to_le32(sizeof(struct nvme_completion) / 16);
> +
>   	status = nvmet_copy_to_sgl(req, 0, id, sizeof(*id));
>   
>   	kfree(id);

Otherwise looks good.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 02/16] nvmet-fc: remove unnecessary bracket
  2023-12-18 15:30 ` [PATCH v3 02/16] nvmet-fc: remove unnecessary bracket Daniel Wagner
  2023-12-19  4:27   ` Christoph Hellwig
@ 2023-12-19  7:25   ` Hannes Reinecke
  1 sibling, 0 replies; 52+ messages in thread
From: Hannes Reinecke @ 2023-12-19  7:25 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch, James Smart

On 12/18/23 16:30, Daniel Wagner wrote:
> There is no need for the bracket around the identifier. Remove it.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/target/fc.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
> index bd59990b5250..bda7a3009e85 100644
> --- a/drivers/nvme/target/fc.c
> +++ b/drivers/nvme/target/fc.c
> @@ -1031,7 +1031,7 @@ nvmet_fc_match_hostport(struct nvmet_fc_tgtport *tgtport, void *hosthandle)
>   	list_for_each_entry(host, &tgtport->host_list, host_list) {
>   		if (host->hosthandle == hosthandle && !host->invalid) {
>   			if (nvmet_fc_hostport_get(host))
> -				return (host);
> +				return host;
>   		}
>   	}
>   
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 05/16] nvmet-fcloop: Remove remote port from list when unlinking
  2023-12-18 15:30 ` [PATCH v3 05/16] nvmet-fcloop: Remove remote port from list when unlinking Daniel Wagner
  2023-12-19  4:31   ` Christoph Hellwig
@ 2023-12-19  7:26   ` Hannes Reinecke
  1 sibling, 0 replies; 52+ messages in thread
From: Hannes Reinecke @ 2023-12-19  7:26 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch, James Smart

On 12/18/23 16:30, Daniel Wagner wrote:
> The remote port is removed too late from fcloop_nports list. Remove it
> when port is unregistered.
> 
> This prevents a busy loop in fcloop_exit, because it is possible the
> remote port is found in the list and thus we will never progress.
> 
> The kernel log will be spammed with
> 
>    nvme_fcloop: fcloop_exit: Failed deleting remote port
>    nvme_fcloop: fcloop_exit: Failed deleting target port
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/target/fcloop.c | 7 ++-----
>   1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/nvme/target/fcloop.c b/drivers/nvme/target/fcloop.c
> index c65a73433c05..ead349af30f1 100644
> --- a/drivers/nvme/target/fcloop.c
> +++ b/drivers/nvme/target/fcloop.c
> @@ -995,11 +995,6 @@ fcloop_nport_free(struct kref *ref)
>   {
>   	struct fcloop_nport *nport =
>   		container_of(ref, struct fcloop_nport, ref);
> -	unsigned long flags;
> -
> -	spin_lock_irqsave(&fcloop_lock, flags);
> -	list_del(&nport->nport_list);
> -	spin_unlock_irqrestore(&fcloop_lock, flags);
>   
>   	kfree(nport);
>   }
> @@ -1357,6 +1352,8 @@ __unlink_remote_port(struct fcloop_nport *nport)
>   		nport->tport->remoteport = NULL;
>   	nport->rport = NULL;
>   
> +	list_del(&nport->nport_list);
> +
>   	return rport;
>   }
>   
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 06/16] nvme-fc: Do not wait in vain when unloading module
  2023-12-18 15:30 ` [PATCH v3 06/16] nvme-fc: Do not wait in vain when unloading module Daniel Wagner
  2023-12-19  4:35   ` Christoph Hellwig
@ 2023-12-19  7:28   ` Hannes Reinecke
  1 sibling, 0 replies; 52+ messages in thread
From: Hannes Reinecke @ 2023-12-19  7:28 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch, James Smart

On 12/18/23 16:30, Daniel Wagner wrote:
> The module unload code will wait for a controller to be delete even when
> there is no controller and we wait for completion forever to happen.
> Thus only wait for the completion when there is a controller which
> needs to be removed.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/host/fc.c | 20 +++++++++++++-------
>   1 file changed, 13 insertions(+), 7 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 07/16] nvmet-fc: Release reference on target port
  2023-12-18 15:30 ` [PATCH v3 07/16] nvmet-fc: Release reference on target port Daniel Wagner
  2023-12-19  4:36   ` Christoph Hellwig
@ 2023-12-19  7:28   ` Hannes Reinecke
  1 sibling, 0 replies; 52+ messages in thread
From: Hannes Reinecke @ 2023-12-19  7:28 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch, James Smart

On 12/18/23 16:30, Daniel Wagner wrote:
> In case we return early out of __nvmet_fc_finish_ls_req() we still have
> to release the reference on the target port.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/target/fc.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
> index bda7a3009e85..28e432e62361 100644
> --- a/drivers/nvme/target/fc.c
> +++ b/drivers/nvme/target/fc.c
> @@ -360,6 +360,7 @@ __nvmet_fc_finish_ls_req(struct nvmet_fc_ls_req_op *lsop)
>   
>   	if (!lsop->req_queued) {
>   		spin_unlock_irqrestore(&tgtport->lock, flags);
> +		nvmet_fc_tgtport_put(tgtport);
>   		return;
>   	}
>   
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 08/16] nvmet-fc: untangle cross refcounting objects
  2023-12-18 15:30 ` [PATCH v3 08/16] nvmet-fc: untangle cross refcounting objects Daniel Wagner
  2023-12-19  5:16   ` Christoph Hellwig
@ 2023-12-19  7:59   ` Hannes Reinecke
  1 sibling, 0 replies; 52+ messages in thread
From: Hannes Reinecke @ 2023-12-19  7:59 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch, James Smart

On 12/18/23 16:30, Daniel Wagner wrote:
> Associations take a refcount on queues, queues take a refcount on
> associations.
> 
> The existing code lead to the situation that the target executes a
> disconnect and the host triggers a reconnect immediately. The reconnect
> command still finds an existing association and uses this. Though the
> reconnect crashes later on because nvmet_fc_delete_target_assoc()
> blindly goes ahead and removes resources while the reconnect code wants
> to use it. The problem is that nvmet_fc_find_target_assoc() is able to
> lookup an association which is being removed.
> 
> So the first thing to address nvmet_fc_find_target_queue() is to remove
> the association out of the list and wait a RCU cycle and free resources
> in the free function callback of the kref_put().
> 
> The live time of the queues are strictly bound to the lifetime of an
> association. Thus we don't need to take reverse refcounts (queue ->
> association).
> 
> Furthermore, streamline the cleanup code by using the workqueue for
> delete the association in nvmet_fc_ls_disconnect. This ensures, that we
> run through the same shutdown path in all non error cases.
> 
> Reproducer: nvme/003
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/target/fc.c | 83 +++++++++++++++++++---------------------
>   1 file changed, 40 insertions(+), 43 deletions(-)
> 
> diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
> index 28e432e62361..db992df13c73 100644
> --- a/drivers/nvme/target/fc.c
> +++ b/drivers/nvme/target/fc.c
> @@ -166,6 +166,7 @@ struct nvmet_fc_tgt_assoc {
>   	struct nvmet_fc_hostport	*hostport;
>   	struct nvmet_fc_ls_iod		*rcv_disconn;
>   	struct list_head		a_list;
> +	struct nvmet_fc_tgt_queue	*_queues[NVMET_NR_QUEUES + 1];
>   	struct nvmet_fc_tgt_queue __rcu	*queues[NVMET_NR_QUEUES + 1];
>   	struct kref			ref;
>   	struct work_struct		del_work;
> @@ -803,14 +804,11 @@ nvmet_fc_alloc_target_queue(struct nvmet_fc_tgt_assoc *assoc,
>   	if (!queue)
>   		return NULL;
>   
> -	if (!nvmet_fc_tgt_a_get(assoc))
> -		goto out_free_queue;
> -
>   	queue->work_q = alloc_workqueue("ntfc%d.%d.%d", 0, 0,
>   				assoc->tgtport->fc_target_port.port_num,
>   				assoc->a_id, qid);
>   	if (!queue->work_q)
> -		goto out_a_put;
> +		goto out_free_queue;
>   
>   	queue->qid = qid;
>   	queue->sqsize = sqsize;
> @@ -831,7 +829,8 @@ nvmet_fc_alloc_target_queue(struct nvmet_fc_tgt_assoc *assoc,
>   	if (ret)
>   		goto out_fail_iodlist;
>   
> -	WARN_ON(assoc->queues[qid]);
> +	WARN_ON(assoc->_queues[qid]);
> +	assoc->_queues[qid] = queue;
>   	rcu_assign_pointer(assoc->queues[qid], queue);
>   
Is it really worth is using an rcu pointer here?
In the end, creating and deleting queues for an association don't happen 
that often, and involve some synchronization points anyway.
IE the lifetime of the queue is actually bounded by the lifetime of the
association itself, so if the association is valid the queues will be
valid, too.

With that reasoning can't we drop rcu above and use the array directly,
delegating any synchronization to the association itself?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 09/16] nvmet-fc: free queue and assoc directly
  2023-12-18 15:30 ` [PATCH v3 09/16] nvmet-fc: free queue and assoc directly Daniel Wagner
@ 2023-12-19  7:59   ` Hannes Reinecke
  0 siblings, 0 replies; 52+ messages in thread
From: Hannes Reinecke @ 2023-12-19  7:59 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch, James Smart

On 12/18/23 16:30, Daniel Wagner wrote:
> Neither struct nvmet_fc_tgt_queue nor struct nvmet_fc_tgt_assoc are data
> structure which are used in a RCU context. So there is no reason to
> delay the free operation.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/target/fc.c | 6 ++----
>   1 file changed, 2 insertions(+), 4 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 10/16] nvmet-fc: hold reference on hostport match
  2023-12-18 15:30 ` [PATCH v3 10/16] nvmet-fc: hold reference on hostport match Daniel Wagner
@ 2023-12-19  8:00   ` Hannes Reinecke
  0 siblings, 0 replies; 52+ messages in thread
From: Hannes Reinecke @ 2023-12-19  8:00 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch, James Smart

On 12/18/23 16:30, Daniel Wagner wrote:
> The hostport data structure is shared between the association, this why
> we keep track of the users via a refcount. So we should not decrement
> the refcount on a match and free the hostport several times.
> 
> Reported by KASAN.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/target/fc.c | 2 --
>   1 file changed, 2 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 15/16] nvmet-fc: avoid deadlock on delete association path
  2023-12-18 15:31 ` [PATCH v3 15/16] nvmet-fc: avoid deadlock on delete association path Daniel Wagner
  2023-12-19  6:00   ` kernel test robot
@ 2023-12-19  8:29   ` kernel test robot
  2023-12-19 11:17   ` kernel test robot
  2023-12-19 11:43   ` Hannes Reinecke
  3 siblings, 0 replies; 52+ messages in thread
From: kernel test robot @ 2023-12-19  8:29 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: oe-kbuild-all, linux-kernel, Christoph Hellwig, Sagi Grimberg,
	Keith Busch, James Smart, Hannes Reinecke, Daniel Wagner

Hi Daniel,

kernel test robot noticed the following build errors:

[auto build test ERROR on linux-nvme/nvme-6.8]
[also build test ERROR on linus/master v6.7-rc6 next-20231219]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Daniel-Wagner/nvmet-fc-remove-unnecessary-bracket/20231219-003117
base:   git://git.infradead.org/nvme.git nvme-6.8
patch link:    https://lore.kernel.org/r/20231218153105.12717-16-dwagner%40suse.de
patch subject: [PATCH v3 15/16] nvmet-fc: avoid deadlock on delete association path
config: loongarch-defconfig (https://download.01.org/0day-ci/archive/20231219/202312191634.ASof5mm8-lkp@intel.com/config)
compiler: loongarch64-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231219/202312191634.ASof5mm8-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202312191634.ASof5mm8-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

   drivers/nvme/target/fc.c: In function 'nvmet_fc_put_tgtport_work':
>> drivers/nvme/target/fc.c:253:9: error: implicit declaration of function 'nvmet_fc_tgtport_put'; did you mean 'nvmet_ctrl_put'? [-Werror=implicit-function-declaration]
     253 |         nvmet_fc_tgtport_put(tgtport);
         |         ^~~~~~~~~~~~~~~~~~~~
         |         nvmet_ctrl_put
   drivers/nvme/target/fc.c: At top level:
>> drivers/nvme/target/fc.c:259:13: warning: conflicting types for 'nvmet_fc_tgtport_put'; have 'void(struct nvmet_fc_tgtport *)'
     259 | static void nvmet_fc_tgtport_put(struct nvmet_fc_tgtport *tgtport);
         |             ^~~~~~~~~~~~~~~~~~~~
   drivers/nvme/target/fc.c:259:13: error: static declaration of 'nvmet_fc_tgtport_put' follows non-static declaration
   drivers/nvme/target/fc.c:253:9: note: previous implicit declaration of 'nvmet_fc_tgtport_put' with type 'void(struct nvmet_fc_tgtport *)'
     253 |         nvmet_fc_tgtport_put(tgtport);
         |         ^~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors


vim +253 drivers/nvme/target/fc.c

   244	
   245	
   246	static void nvmet_fc_handle_ls_rqst_work(struct work_struct *work);
   247	static void nvmet_fc_fcp_rqst_op_defer_work(struct work_struct *work);
   248	static void nvmet_fc_put_tgtport_work(struct work_struct *work)
   249	{
   250		struct nvmet_fc_tgtport *tgtport =
   251			container_of(work, struct nvmet_fc_tgtport, put_work);
   252	
 > 253		nvmet_fc_tgtport_put(tgtport);
   254	}
   255	static void nvmet_fc_tgt_a_put(struct nvmet_fc_tgt_assoc *assoc);
   256	static int nvmet_fc_tgt_a_get(struct nvmet_fc_tgt_assoc *assoc);
   257	static void nvmet_fc_tgt_q_put(struct nvmet_fc_tgt_queue *queue);
   258	static int nvmet_fc_tgt_q_get(struct nvmet_fc_tgt_queue *queue);
 > 259	static void nvmet_fc_tgtport_put(struct nvmet_fc_tgtport *tgtport);
   260	static int nvmet_fc_tgtport_get(struct nvmet_fc_tgtport *tgtport);
   261	static void nvmet_fc_handle_fcp_rqst(struct nvmet_fc_tgtport *tgtport,
   262						struct nvmet_fc_fcp_iod *fod);
   263	static void nvmet_fc_delete_target_assoc(struct nvmet_fc_tgt_assoc *assoc);
   264	static void nvmet_fc_xmt_ls_rsp(struct nvmet_fc_tgtport *tgtport,
   265					struct nvmet_fc_ls_iod *iod);
   266	
   267	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 15/16] nvmet-fc: avoid deadlock on delete association path
  2023-12-18 15:31 ` [PATCH v3 15/16] nvmet-fc: avoid deadlock on delete association path Daniel Wagner
  2023-12-19  6:00   ` kernel test robot
  2023-12-19  8:29   ` kernel test robot
@ 2023-12-19 11:17   ` kernel test robot
  2023-12-19 11:43   ` Hannes Reinecke
  3 siblings, 0 replies; 52+ messages in thread
From: kernel test robot @ 2023-12-19 11:17 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: oe-kbuild-all, linux-kernel, Christoph Hellwig, Sagi Grimberg,
	Keith Busch, James Smart, Hannes Reinecke, Daniel Wagner

Hi Daniel,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linux-nvme/nvme-6.8]
[also build test WARNING on linus/master v6.7-rc6 next-20231219]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Daniel-Wagner/nvmet-fc-remove-unnecessary-bracket/20231219-003117
base:   git://git.infradead.org/nvme.git nvme-6.8
patch link:    https://lore.kernel.org/r/20231218153105.12717-16-dwagner%40suse.de
patch subject: [PATCH v3 15/16] nvmet-fc: avoid deadlock on delete association path
config: i386-randconfig-013-20231219 (https://download.01.org/0day-ci/archive/20231219/202312191845.lwrs1kbh-lkp@intel.com/config)
compiler: gcc-7 (Ubuntu 7.5.0-6ubuntu2) 7.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231219/202312191845.lwrs1kbh-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202312191845.lwrs1kbh-lkp@intel.com/

All warnings (new ones prefixed by >>):

   drivers/nvme/target/fc.c: In function 'nvmet_fc_put_tgtport_work':
   drivers/nvme/target/fc.c:253:2: error: implicit declaration of function 'nvmet_fc_tgtport_put'; did you mean 'nvmet_ctrl_put'? [-Werror=implicit-function-declaration]
     nvmet_fc_tgtport_put(tgtport);
     ^~~~~~~~~~~~~~~~~~~~
     nvmet_ctrl_put
   drivers/nvme/target/fc.c: At top level:
>> drivers/nvme/target/fc.c:259:13: warning: conflicting types for 'nvmet_fc_tgtport_put'
    static void nvmet_fc_tgtport_put(struct nvmet_fc_tgtport *tgtport);
                ^~~~~~~~~~~~~~~~~~~~
   drivers/nvme/target/fc.c:259:13: error: static declaration of 'nvmet_fc_tgtport_put' follows non-static declaration
   drivers/nvme/target/fc.c:253:2: note: previous implicit declaration of 'nvmet_fc_tgtport_put' was here
     nvmet_fc_tgtport_put(tgtport);
     ^~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors


vim +/nvmet_fc_tgtport_put +259 drivers/nvme/target/fc.c

c53432030d8642 James Smart   2016-12-02  244  
c53432030d8642 James Smart   2016-12-02  245  
c53432030d8642 James Smart   2016-12-02  246  static void nvmet_fc_handle_ls_rqst_work(struct work_struct *work);
9d625f7792875e James Smart   2018-02-28  247  static void nvmet_fc_fcp_rqst_op_defer_work(struct work_struct *work);
20d5f3b830ab45 Daniel Wagner 2023-12-18  248  static void nvmet_fc_put_tgtport_work(struct work_struct *work)
20d5f3b830ab45 Daniel Wagner 2023-12-18  249  {
20d5f3b830ab45 Daniel Wagner 2023-12-18  250  	struct nvmet_fc_tgtport *tgtport =
20d5f3b830ab45 Daniel Wagner 2023-12-18  251  		container_of(work, struct nvmet_fc_tgtport, put_work);
20d5f3b830ab45 Daniel Wagner 2023-12-18  252  
20d5f3b830ab45 Daniel Wagner 2023-12-18  253  	nvmet_fc_tgtport_put(tgtport);
20d5f3b830ab45 Daniel Wagner 2023-12-18  254  }
c53432030d8642 James Smart   2016-12-02  255  static void nvmet_fc_tgt_a_put(struct nvmet_fc_tgt_assoc *assoc);
c53432030d8642 James Smart   2016-12-02  256  static int nvmet_fc_tgt_a_get(struct nvmet_fc_tgt_assoc *assoc);
c53432030d8642 James Smart   2016-12-02  257  static void nvmet_fc_tgt_q_put(struct nvmet_fc_tgt_queue *queue);
c53432030d8642 James Smart   2016-12-02  258  static int nvmet_fc_tgt_q_get(struct nvmet_fc_tgt_queue *queue);
c53432030d8642 James Smart   2016-12-02 @259  static void nvmet_fc_tgtport_put(struct nvmet_fc_tgtport *tgtport);
c53432030d8642 James Smart   2016-12-02  260  static int nvmet_fc_tgtport_get(struct nvmet_fc_tgtport *tgtport);
0fb228d30b8d72 James Smart   2017-08-01  261  static void nvmet_fc_handle_fcp_rqst(struct nvmet_fc_tgtport *tgtport,
0fb228d30b8d72 James Smart   2017-08-01  262  					struct nvmet_fc_fcp_iod *fod);
a96d4bd867129e James Smart   2017-10-27  263  static void nvmet_fc_delete_target_assoc(struct nvmet_fc_tgt_assoc *assoc);
47bf3241064498 James Smart   2020-03-31  264  static void nvmet_fc_xmt_ls_rsp(struct nvmet_fc_tgtport *tgtport,
47bf3241064498 James Smart   2020-03-31  265  				struct nvmet_fc_ls_iod *iod);
c53432030d8642 James Smart   2016-12-02  266  
c53432030d8642 James Smart   2016-12-02  267  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 11/16] nvmet-fc: remove null hostport pointer check
  2023-12-18 15:30 ` [PATCH v3 11/16] nvmet-fc: remove null hostport pointer check Daniel Wagner
@ 2023-12-19 11:37   ` Hannes Reinecke
  0 siblings, 0 replies; 52+ messages in thread
From: Hannes Reinecke @ 2023-12-19 11:37 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch, James Smart

On 12/18/23 16:30, Daniel Wagner wrote:
> An association has always a valid hostport pointer. Remove useless
> null pointer check.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/target/fc.c | 6 ++----
>   1 file changed, 2 insertions(+), 4 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 12/16] nvmet-fc: do not tack refs on tgtports from assoc
  2023-12-18 15:31 ` [PATCH v3 12/16] nvmet-fc: do not tack refs on tgtports from assoc Daniel Wagner
@ 2023-12-19 11:38   ` Hannes Reinecke
  0 siblings, 0 replies; 52+ messages in thread
From: Hannes Reinecke @ 2023-12-19 11:38 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch, James Smart

On 12/18/23 16:31, Daniel Wagner wrote:
> The association life time is tight to the life time of the target port.
                                tied

> That means we do not take extra a refcount when creating a association.
                 should not

> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/target/fc.c | 8 +-------
>   1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
> index c243085d6f42..47cecc8c72b2 100644
> --- a/drivers/nvme/target/fc.c
> +++ b/drivers/nvme/target/fc.c
> @@ -1109,12 +1109,9 @@ nvmet_fc_alloc_target_assoc(struct nvmet_fc_tgtport *tgtport, void *hosthandle)
>   	if (idx < 0)
>   		goto out_free_assoc;
>   
> -	if (!nvmet_fc_tgtport_get(tgtport))
> -		goto out_ida;
> -
>   	assoc->hostport = nvmet_fc_alloc_hostport(tgtport, hosthandle);
>   	if (IS_ERR(assoc->hostport))
> -		goto out_put;
> +		goto out_ida;
>   
>   	assoc->tgtport = tgtport;
>   	assoc->a_id = idx;
> @@ -1144,8 +1141,6 @@ nvmet_fc_alloc_target_assoc(struct nvmet_fc_tgtport *tgtport, void *hosthandle)
>   
>   	return assoc;
>   
> -out_put:
> -	nvmet_fc_tgtport_put(tgtport);
>   out_ida:
>   	ida_free(&tgtport->assoc_cnt, idx);
>   out_free_assoc:
> @@ -1182,7 +1177,6 @@ nvmet_fc_target_assoc_free(struct kref *ref)
>   	dev_info(tgtport->dev,
>   		"{%d:%d} Association freed\n",
>   		tgtport->fc_target_port.port_num, assoc->a_id);
> -	nvmet_fc_tgtport_put(tgtport);
>   	kfree(assoc);
>   }
>   
Otherwise:
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 13/16] nvmet-fc: abort command if when there is binding
  2023-12-18 15:31 ` [PATCH v3 13/16] nvmet-fc: abort command if when there is binding Daniel Wagner
@ 2023-12-19 11:39   ` Hannes Reinecke
  0 siblings, 0 replies; 52+ messages in thread
From: Hannes Reinecke @ 2023-12-19 11:39 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch, James Smart

On 12/18/23 16:31, Daniel Wagner wrote:
> WHen the target port has not active port binding, there is no point in
> trying to process the command as it has to fail anyway. Instead adding
> checks to all commands abort the command early.
> 
Please fix up the subject: 'abort command when there is no binding'

> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/target/fc.c | 8 ++++++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
> index 47cecc8c72b2..663c51c9fe53 100644
> --- a/drivers/nvme/target/fc.c
> +++ b/drivers/nvme/target/fc.c
> @@ -1101,6 +1101,9 @@ nvmet_fc_alloc_target_assoc(struct nvmet_fc_tgtport *tgtport, void *hosthandle)
>   	int idx;
>   	bool needrandom = true;
>   
> +	if (!tgtport->pe)
> +		return NULL;
> +
>   	assoc = kzalloc(sizeof(*assoc), GFP_KERNEL);
>   	if (!assoc)
>   		return NULL;
> @@ -2520,8 +2523,9 @@ nvmet_fc_handle_fcp_rqst(struct nvmet_fc_tgtport *tgtport,
>   
>   	fod->req.cmd = &fod->cmdiubuf.sqe;
>   	fod->req.cqe = &fod->rspiubuf.cqe;
> -	if (tgtport->pe)
> -		fod->req.port = tgtport->pe->port;
> +	if (!tgtport->pe)
> +		goto transport_error;
> +	fod->req.port = tgtport->pe->port;
>   
>   	/* clear any response payload */
>   	memset(&fod->rspiubuf, 0, sizeof(fod->rspiubuf));

Otherwise:

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 14/16] nvmet-fc: free hostport after release reference to tgtport
  2023-12-18 15:31 ` [PATCH v3 14/16] nvmet-fc: free hostport after release reference to tgtport Daniel Wagner
@ 2023-12-19 11:41   ` Hannes Reinecke
  0 siblings, 0 replies; 52+ messages in thread
From: Hannes Reinecke @ 2023-12-19 11:41 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch, James Smart

On 12/18/23 16:31, Daniel Wagner wrote:
> Give the ref back before destroying the hostport object to prevent a
> potential UAF.
> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/target/fc.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
> index 663c51c9fe53..23d8779dc221 100644
> --- a/drivers/nvme/target/fc.c
> +++ b/drivers/nvme/target/fc.c
> @@ -986,8 +986,8 @@ nvmet_fc_hostport_free(struct kref *ref)
>   	spin_unlock_irqrestore(&tgtport->lock, flags);
>   	if (tgtport->ops->host_release && hostport->invalid)
>   		tgtport->ops->host_release(hostport->hosthandle);
> -	kfree(hostport);
>   	nvmet_fc_tgtport_put(tgtport);
> +	kfree(hostport);
>   }
>   
>   static void

That, I guess, needs some more explanation.
It's not immediately obvious why a 'put' on the targetport would
have implications on the hostport. But when it does wouldn't it
be more sensible to free up the reference to the hostport when the
final 'put' on the target port happens?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 15/16] nvmet-fc: avoid deadlock on delete association path
  2023-12-18 15:31 ` [PATCH v3 15/16] nvmet-fc: avoid deadlock on delete association path Daniel Wagner
                     ` (2 preceding siblings ...)
  2023-12-19 11:17   ` kernel test robot
@ 2023-12-19 11:43   ` Hannes Reinecke
  3 siblings, 0 replies; 52+ messages in thread
From: Hannes Reinecke @ 2023-12-19 11:43 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch, James Smart

On 12/18/23 16:31, Daniel Wagner wrote:
> When deleting an association the shutdown path is deadlocking because we
> try to flush the nvmet_wq nested. Avoid this by deadlock by deferring
> the put work into its own work item.
> 
Maybe deleting the first 'by' ?

> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/target/fc.c | 14 ++++++++++++--
>   1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
> index 23d8779dc221..30ba4ede333f 100644
> --- a/drivers/nvme/target/fc.c
> +++ b/drivers/nvme/target/fc.c
> @@ -111,6 +111,8 @@ struct nvmet_fc_tgtport {
>   	struct nvmet_fc_port_entry	*pe;
>   	struct kref			ref;
>   	u32				max_sg_cnt;
> +
> +	struct work_struct		put_work;
>   };
>   
>   struct nvmet_fc_port_entry {
> @@ -243,6 +245,13 @@ static LIST_HEAD(nvmet_fc_portentry_list);
>   
>   static void nvmet_fc_handle_ls_rqst_work(struct work_struct *work);
>   static void nvmet_fc_fcp_rqst_op_defer_work(struct work_struct *work);
> +static void nvmet_fc_put_tgtport_work(struct work_struct *work)
> +{
> +	struct nvmet_fc_tgtport *tgtport =
> +		container_of(work, struct nvmet_fc_tgtport, put_work);
> +
> +	nvmet_fc_tgtport_put(tgtport);
> +}
>   static void nvmet_fc_tgt_a_put(struct nvmet_fc_tgt_assoc *assoc);
>   static int nvmet_fc_tgt_a_get(struct nvmet_fc_tgt_assoc *assoc);
>   static void nvmet_fc_tgt_q_put(struct nvmet_fc_tgt_queue *queue);
> @@ -359,7 +368,7 @@ __nvmet_fc_finish_ls_req(struct nvmet_fc_ls_req_op *lsop)
>   
>   	if (!lsop->req_queued) {
>   		spin_unlock_irqrestore(&tgtport->lock, flags);
> -		nvmet_fc_tgtport_put(tgtport);
> +		queue_work(nvmet_wq, &tgtport->put_work);
>   		return;
>   	}
>   
> @@ -373,7 +382,7 @@ __nvmet_fc_finish_ls_req(struct nvmet_fc_ls_req_op *lsop)
>   				  (lsreq->rqstlen + lsreq->rsplen),
>   				  DMA_BIDIRECTIONAL);
>   
> -	nvmet_fc_tgtport_put(tgtport);
> +	queue_work(nvmet_wq, &tgtport->put_work);
>   }
>   
>   static int
> @@ -1402,6 +1411,7 @@ nvmet_fc_register_targetport(struct nvmet_fc_port_info *pinfo,
>   	kref_init(&newrec->ref);
>   	ida_init(&newrec->assoc_cnt);
>   	newrec->max_sg_cnt = template->max_sgl_segments;
> +	INIT_WORK(&newrec->put_work, nvmet_fc_put_tgtport_work);
>   
>   	ret = nvmet_fc_alloc_ls_iodlist(newrec);
>   	if (ret) {
Otherwise:

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 16/16] nvmet-fc: take ref count on tgtport before delete assoc
  2023-12-18 15:31 ` [PATCH v3 16/16] nvmet-fc: take ref count on tgtport before delete assoc Daniel Wagner
@ 2023-12-19 11:47   ` Hannes Reinecke
  0 siblings, 0 replies; 52+ messages in thread
From: Hannes Reinecke @ 2023-12-19 11:47 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch, James Smart

On 12/18/23 16:31, Daniel Wagner wrote:
> We have to ensure that the tgtport is not going away
> before be have remove all the associations.

before we have removed all the associations.

> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/target/fc.c | 31 +++++++++++++++++++++++--------
>   1 file changed, 23 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
> index 30ba4ede333f..455d35ef97eb 100644
> --- a/drivers/nvme/target/fc.c
> +++ b/drivers/nvme/target/fc.c
> @@ -1092,13 +1092,28 @@ nvmet_fc_alloc_hostport(struct nvmet_fc_tgtport *tgtport, void *hosthandle)
>   }
>   
>   static void
> -nvmet_fc_delete_assoc(struct work_struct *work)
> +nvmet_fc_delete_assoc(struct nvmet_fc_tgt_assoc *assoc)
> +{
> +	nvmet_fc_delete_target_assoc(assoc);
> +	nvmet_fc_tgt_a_put(assoc);
> +}
> +
> +static void
> +nvmet_fc_delete_assoc_work(struct work_struct *work)
>   {
>   	struct nvmet_fc_tgt_assoc *assoc =
>   		container_of(work, struct nvmet_fc_tgt_assoc, del_work);
> +	struct nvmet_fc_tgtport *tgtport = assoc->tgtport;
>   
> -	nvmet_fc_delete_target_assoc(assoc);
> -	nvmet_fc_tgt_a_put(assoc);
> +	nvmet_fc_delete_assoc(assoc);
> +	nvmet_fc_tgtport_put(tgtport);
> +}
> +
> +static void
> +nvmet_fc_schedule_delete_assoc(struct nvmet_fc_tgt_assoc *assoc)
> +{
> +	nvmet_fc_tgtport_get(assoc->tgtport);
> +	queue_work(nvmet_wq, &assoc->del_work);

Errm.
That is dangerous, as it will leak a reference if 'del_work' is already 
queued.
And we already took a reference from the caller. Why do we need two 
references here?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 17/17] nvmet-fc: add extensive debug logging
  2023-12-18 15:31 ` [PATCH v3 17/17] nvmet-fc: add extensive debug logging Daniel Wagner
@ 2023-12-19 11:54   ` Hannes Reinecke
  2023-12-19 14:06     ` Daniel Wagner
  0 siblings, 1 reply; 52+ messages in thread
From: Hannes Reinecke @ 2023-12-19 11:54 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme
  Cc: linux-kernel, Christoph Hellwig, Sagi Grimberg, Keith Busch, James Smart

On 12/18/23 16:31, Daniel Wagner wrote:
> Signed-off-by: Daniel Wagner <dwagner@suse.de>

Well ... all previous patches had 'XX/16', only this has '17/17'.
Late addendum?
But in either case, please ensure correct counting (and spelling!)
on the next submission.

> ---
>   drivers/nvme/target/configfs.c |   4 +
>   drivers/nvme/target/core.c     |  13 ++++
>   drivers/nvme/target/fc.c       | 132 +++++++++++++++++++++++++++++----
>   3 files changed, 135 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
> index e307a044b1a1..ea05e8c62d4b 100644
> --- a/drivers/nvme/target/configfs.c
> +++ b/drivers/nvme/target/configfs.c
> @@ -965,6 +965,7 @@ static int nvmet_port_subsys_allow_link(struct config_item *parent,
>   			goto out_free_link;
>   	}
>   
> +	pr_info("%s: %s\n", __func__, subsys->subsysnqn);

If this is debug logging, would 'pr_debug' be more accurate?

>   	if (list_empty(&port->subsystems)) {
>   		ret = nvmet_enable_port(port);
>   		if (ret)
> @@ -1050,6 +1051,7 @@ static int nvmet_allowed_hosts_allow_link(struct config_item *parent,
>   		if (!strcmp(nvmet_host_name(p->host), nvmet_host_name(host)))
>   			goto out_free_link;
>   	}
> +	pr_info("%s: adding hostnqn %s\n", __func__, nvmet_host_name(host));
>   	list_add_tail(&link->entry, &subsys->hosts);
>   	nvmet_subsys_disc_changed(subsys, host);
>   
> @@ -1879,6 +1881,8 @@ static struct config_group *nvmet_ports_make(struct config_group *group,
>   	u16 portid;
>   	u32 i;
>   
> +	pr_info("%s\n", __func__);
> +
The meaning of this is ... what?
Of course this function is called when a port is created. So why do you 
need to log that in the message log?

>   	if (kstrtou16(name, 0, &portid))
>   		return ERR_PTR(-EINVAL);
>   
> diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
> index 3935165048e7..4d5a9e4fcc9d 100644
> --- a/drivers/nvme/target/core.c
> +++ b/drivers/nvme/target/core.c
> @@ -308,8 +308,11 @@ void nvmet_port_del_ctrls(struct nvmet_port *port, struct nvmet_subsys *subsys)
>   {
>   	struct nvmet_ctrl *ctrl;
>   
> +	pr_info("%s: subsys %s port %p\n", __func__, subsys->subsysnqn, port);
> +
pr_debug()

>   	mutex_lock(&subsys->lock);
>   	list_for_each_entry(ctrl, &subsys->ctrls, subsys_entry) {
> +		pr_info("%s: ctrl %p ctrl->port %p\n", __func__, ctrl, ctrl->port);
>   		if (ctrl->port == port)
>   			ctrl->ops->delete_ctrl(ctrl);
>   	}
> @@ -1458,6 +1461,8 @@ u16 nvmet_alloc_ctrl(const char *subsysnqn, const char *hostnqn,
>   	mutex_unlock(&subsys->lock);
>   
>   	*ctrlp = ctrl;
> +
> +	pr_info("%s: ctrl %p, subsysnqn %s hostnqn %s\n", __func__, ctrl, subsysnqn, hostnqn);

Wouldn't it be better to list the controller id, too?

>   	return 0;
>   
>   out_free_sqs:
> @@ -1477,6 +1482,8 @@ static void nvmet_ctrl_free(struct kref *ref)
>   	struct nvmet_ctrl *ctrl = container_of(ref, struct nvmet_ctrl, ref);
>   	struct nvmet_subsys *subsys = ctrl->subsys;
>   
> +	pr_info("%s: ctrl %p %s\n", __func__, ctrl, ctrl->subsysnqn);
> +

Please, no pointer, use the controller id instead.

>   	mutex_lock(&subsys->lock);
>   	nvmet_release_p2p_ns_map(ctrl);
>   	list_del(&ctrl->subsys_entry);
> @@ -1550,6 +1557,8 @@ struct nvmet_subsys *nvmet_subsys_alloc(const char *subsysnqn,
>   	char serial[NVMET_SN_MAX_SIZE / 2];
>   	int ret;
>   
> +	pr_info("%s\n", __func__);
> +
See above. Pretty pointless.

>   	subsys = kzalloc(sizeof(*subsys), GFP_KERNEL);
>   	if (!subsys)
>   		return ERR_PTR(-ENOMEM);
> @@ -1620,6 +1629,8 @@ static void nvmet_subsys_free(struct kref *ref)
>   
>   	WARN_ON_ONCE(!xa_empty(&subsys->namespaces));
>   
> +	pr_info("%s\n", __func__);
> +
>   	xa_destroy(&subsys->namespaces);
>   	nvmet_passthru_subsys_free(subsys);
>   
> @@ -1633,6 +1644,8 @@ void nvmet_subsys_del_ctrls(struct nvmet_subsys *subsys)
>   {
>   	struct nvmet_ctrl *ctrl;
>   
> +	pr_info("%s\n", __func__);
> +
>   	mutex_lock(&subsys->lock);
>   	list_for_each_entry(ctrl, &subsys->ctrls, subsys_entry)
>   		ctrl->ops->delete_ctrl(ctrl);
> diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
> index 455d35ef97eb..d50ff29697fc 100644
> --- a/drivers/nvme/target/fc.c
> +++ b/drivers/nvme/target/fc.c
> @@ -242,6 +242,31 @@ static LIST_HEAD(nvmet_fc_target_list);
>   static DEFINE_IDA(nvmet_fc_tgtport_cnt);
>   static LIST_HEAD(nvmet_fc_portentry_list);
>   
> +static void __nvmet_fc_tgtport_put(struct nvmet_fc_tgtport *tgtport);
> +static int __nvmet_fc_tgtport_get(struct nvmet_fc_tgtport *tgtport);
> +
> +#if 1
> +#define nvmet_fc_tgtport_put(p)						\
> +({									\
> +	pr_info("nvmet_fc_tgtport_put: %p %d %s:%d\n",	\
> +		p, atomic_read(&p->ref.refcount.refs),			\
> +		__func__, __LINE__);					\
> +	__nvmet_fc_tgtport_put(p);					\
> +})
> +
> +#define nvmet_fc_tgtport_get(p)						\
> +({									\
> +	int ___r = __nvmet_fc_tgtport_get(p);				\
> +									\
> +	pr_info("nvmet_fc_tgtport_get: %p %d %s:%d\n",			\
> +		p, atomic_read(&p->ref.refcount.refs),			\
> +		__func__, __LINE__);					\
> +	___r;								\
> +})
> +#else
> +#define nvmet_fc_tgtport_put(p) __nvmet_fc_tgtport_put(p)
> +#define nvmet_fc_tgtport_get(p) __nvmet_fc_tgtport_get(p)
> +#endif
>   
>   static void nvmet_fc_handle_ls_rqst_work(struct work_struct *work);
>   static void nvmet_fc_fcp_rqst_op_defer_work(struct work_struct *work);
> @@ -252,12 +277,84 @@ static void nvmet_fc_put_tgtport_work(struct work_struct *work)
>   
>   	nvmet_fc_tgtport_put(tgtport);
>   }
> -static void nvmet_fc_tgt_a_put(struct nvmet_fc_tgt_assoc *assoc);
> -static int nvmet_fc_tgt_a_get(struct nvmet_fc_tgt_assoc *assoc);
> -static void nvmet_fc_tgt_q_put(struct nvmet_fc_tgt_queue *queue);
> -static int nvmet_fc_tgt_q_get(struct nvmet_fc_tgt_queue *queue);
> -static void nvmet_fc_tgtport_put(struct nvmet_fc_tgtport *tgtport);
> -static int nvmet_fc_tgtport_get(struct nvmet_fc_tgtport *tgtport);
> +static void __nvmet_fc_tgt_a_put(struct nvmet_fc_tgt_assoc *assoc);
> +static int __nvmet_fc_tgt_a_get(struct nvmet_fc_tgt_assoc *assoc);
> +
> +#if 1
> +#define nvmet_fc_tgt_a_put(a)						\
> +({									\
> +	pr_info("nvmet_fc_tgt_a_put: %d %d %s:%d \n",			\
> +		a->a_id, atomic_read(&a->ref.refcount.refs),		\
> +		__func__, __LINE__);					\
> +	__nvmet_fc_tgt_a_put(a);					\
> +})
> +
> +#define nvmet_fc_tgt_a_get(a)						\
> +({									\
> +	int ___r = __nvmet_fc_tgt_a_get(a);				\
> +									\
> +	pr_info("nvmet_fc_tgt_a_get: %d %d %s:%d\n",			\
> +		a->a_id, atomic_read(&a->ref.refcount.refs),		\
> +		__func__, __LINE__);					\
> +	___r;								\
> +})
> +#else
> +#define nvmet_fc_tgt_a_put(a) __nvmet_fc_tgt_a_put(a)
> +#define nvmet_fc_tgt_a_get(a) __nvmet_fc_tgt_a_get(a)
> +#endif
> +
> +static void __nvmet_fc_hostport_put(struct nvmet_fc_hostport *hostport);
> +static int __nvmet_fc_hostport_get(struct nvmet_fc_hostport *hostport);
> +
> +#if 0
> +#define nvmet_fc_hostport_put(p)					\
> +({									\
> +	pr_info("nvmet_fc_hostport_put: %p %d %s:%d\n",			\
> +		p, atomic_read(&p->ref.refcount.refs),			\
> +		__func__, __LINE__);					\
> +	__nvmet_fc_hostport_put(p);					\
> +})
> +
> +#define nvmet_fc_hostport_get(p)					\
> +({									\
> +	int ___r = __nvmet_fc_hostport_get(p);				\
> +									\
> +	pr_info("nvmet_fc_hostport_get: %p %d %s:%d\n",			\
> +		p, atomic_read(&p->ref.refcount.refs),			\
> +		__func__, __LINE__);					\
> +	___r;								\
> +})
> +#else
> +#define nvmet_fc_hostport_put(p) __nvmet_fc_hostport_put(p)
> +#define nvmet_fc_hostport_get(p) __nvmet_fc_hostport_get(p)
> +#endif
> +
> +static void __nvmet_fc_tgt_q_put(struct nvmet_fc_tgt_queue *queue);
> +static int __nvmet_fc_tgt_q_get(struct nvmet_fc_tgt_queue *queue);
> +
> +#if 0
If '0'?
Please, no.
Use dynamic debugging.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 17/17] nvmet-fc: add extensive debug logging
  2023-12-19 11:54   ` Hannes Reinecke
@ 2023-12-19 14:06     ` Daniel Wagner
  0 siblings, 0 replies; 52+ messages in thread
From: Daniel Wagner @ 2023-12-19 14:06 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: linux-nvme, linux-kernel, Christoph Hellwig, Sagi Grimberg,
	Keith Busch, James Smart

On Tue, Dec 19, 2023 at 12:54:12PM +0100, Hannes Reinecke wrote:
> On 12/18/23 16:31, Daniel Wagner wrote:
> > Signed-off-by: Daniel Wagner <dwagner@suse.de>
> 
> Well ... all previous patches had 'XX/16', only this has '17/17'.
> Late addendum?
> But in either case, please ensure correct counting (and spelling!)
> on the next submission.

Sorry, this patch was not meant to go out. It's just debuggin code for
me.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 01/16] nvmet: report ioccsz and iorcsz for disc ctrl
  2023-12-19  7:24   ` Hannes Reinecke
@ 2023-12-19 15:15     ` Daniel Wagner
  0 siblings, 0 replies; 52+ messages in thread
From: Daniel Wagner @ 2023-12-19 15:15 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: linux-nvme, linux-kernel, Christoph Hellwig, Sagi Grimberg,
	Keith Busch, James Smart

On Tue, Dec 19, 2023 at 08:24:58AM +0100, Hannes Reinecke wrote:
> > +	 * Disable in-capsule data for Metadata capable controllers.
> > +	 */
> > +	cmd_capsule_size = sizeof(struct nvme_command);
> > +	if (!ctrl->pi_support)
> > +		cmd_capsule_size += req->port->inline_data_size;
> > +	id->ioccsz = cpu_to_le32(cmd_capsule_size / 16);
> 
> Why the division by 16?

The unit size is 16 bytes:

  I/O Queue Command Capsule Supported Size (IOCCSZ): This field defines the
  maximum I/O command capsule size in 16 byte units.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 01/16] nvmet: report ioccsz and iorcsz for disc ctrl
  2023-12-19  4:27   ` Christoph Hellwig
@ 2023-12-20  0:50     ` Max Gurtovoy
  0 siblings, 0 replies; 52+ messages in thread
From: Max Gurtovoy @ 2023-12-20  0:50 UTC (permalink / raw)
  To: Christoph Hellwig, Daniel Wagner
  Cc: linux-nvme, linux-kernel, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke



On 19/12/2023 6:27, Christoph Hellwig wrote:
> On Mon, Dec 18, 2023 at 04:30:49PM +0100, Daniel Wagner wrote:
>> +	/*
>> +	 * Max command capsule size is sqe + in-capsule data size.
>> +	 * Disable in-capsule data for Metadata capable controllers.
> 
> A why on the disable would be useful here - the fact that it is disabled
> is pretty obvious from the code.
> 

We have another thread on this patch that I think is wrong since the 
discovery controller shouldn't set these values according to the spec. 
It is explicitly say that ioccsz and iorcsz are reserved for discovery 
controllers.

I've sent a proposal to fix the initiator/host code.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 08/16] nvmet-fc: untangle cross refcounting objects
  2023-12-19  5:16   ` Christoph Hellwig
@ 2023-12-21  9:15     ` Daniel Wagner
  2024-01-26 15:32       ` Daniel Wagner
  0 siblings, 1 reply; 52+ messages in thread
From: Daniel Wagner @ 2023-12-21  9:15 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-nvme, linux-kernel, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke

On Tue, Dec 19, 2023 at 06:16:48AM +0100, Christoph Hellwig wrote:
> On Mon, Dec 18, 2023 at 04:30:56PM +0100, Daniel Wagner wrote:
> > The live time of the queues are strictly bound to the lifetime of an
> 
> > +	struct nvmet_fc_tgt_queue	*_queues[NVMET_NR_QUEUES + 1];
> >  	struct nvmet_fc_tgt_queue __rcu	*queues[NVMET_NR_QUEUES + 1];
> 
> For magic prefixes we use __, not _ in Linux.  But having two arrays
> of queues right next to each other, once with rcu annotation and one
> not rings a bit far warning bell to me.  Why do we have both?  When
> are we supposed to use either?  Why is FC different from rest?

This is my attempt to solve the problem that after NULLing the rcu
pointer and wait for an RCU graceperiod I still need to cleanup the
queues. Thus I need to keep hold on the queue pointers a bit longer.
Indeed not so elegant.

I'm sure there is a better way to do it, I just didn't figure it out
when I wrote this part. Any tips are highly welcomed how to solve this
puzzle.

> I really don't have any good answers as I don't know the code in the
> FC transport very well, but I think this needs more work.

Indeed, blktests still is able to run into a hanger. So not all problems
are sovled but getting closer.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 00/17] enable nvmet-fc for blktests
  2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
                   ` (17 preceding siblings ...)
  2023-12-18 16:10 ` [PATCH v3 00/17] enable nvmet-fc for blktests Maurizio Lombardi
@ 2024-01-02 21:36 ` Keith Busch
  18 siblings, 0 replies; 52+ messages in thread
From: Keith Busch @ 2024-01-02 21:36 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-nvme, linux-kernel, Christoph Hellwig, Sagi Grimberg,
	James Smart, Hannes Reinecke

On Mon, Dec 18, 2023 at 04:30:48PM +0100, Daniel Wagner wrote:
> Another update on getting nvmet-fc ready for blktests. The main change here is
> that I tried make sense of the ref count taking in nvmet-fc. When running
> blktests with the auto connect udev rule activated the additional connect
> attempt etc made nvmet-fc explode and choke everywhere. After a lot of poking
> and pondering I decided to change the rules who the ref counts are taken for the
> ctrl, association, target port and host port. This made a big difference and I
> am able to get blktests pass the tests.
> 
> Also KASAN was reporting a lot of UAFs. There are still some problems left as I
> can still observe hangers when running blktests in a loop for a while. But it
> doesn't explode immediately so I consider this a win.
> 
> Apropos KASAN, it still reports the problem from [1], so anyone who want to run
> this series needs to revert ee6fdc5055e9 ("nvme-fc: fix race between error
> recovery and creating association").
> 
> The first four patches are independent of the rest and could go in sooner.

Applied patches 2-5 to nvme-6.8.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 06/16] nvme-fc: Do not wait in vain when unloading module
  2023-12-19  4:35   ` Christoph Hellwig
@ 2024-01-05  5:05     ` Keith Busch
  0 siblings, 0 replies; 52+ messages in thread
From: Keith Busch @ 2024-01-05  5:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Daniel Wagner, linux-nvme, linux-kernel, Sagi Grimberg,
	James Smart, Hannes Reinecke

On Tue, Dec 19, 2023 at 05:35:14AM +0100, Christoph Hellwig wrote:
> On Mon, Dec 18, 2023 at 04:30:54PM +0100, Daniel Wagner wrote:
> > The module unload code will wait for a controller to be delete even when
> > there is no controller and we wait for completion forever to happen.
> > Thus only wait for the completion when there is a controller which
> > needs to be removed.
> 
> This whole code looks fishy to me, and I suspect this patch only papers
> over it.  Why do we this wait to start with?  If we've found that out and
> documented it, the code really should be using a wait_event variant that
> checks for the actual condition (no more controllers), because without
> that you might still have a race otherwise.

The synchronization here does feel off, but Daniel's change looks
correct to the current implementation and is a minimal diff to fix it.
Do you want to see this re-worked with a better wait condition or can we
proceed with this?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 08/16] nvmet-fc: untangle cross refcounting objects
  2023-12-21  9:15     ` Daniel Wagner
@ 2024-01-26 15:32       ` Daniel Wagner
  0 siblings, 0 replies; 52+ messages in thread
From: Daniel Wagner @ 2024-01-26 15:32 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-nvme, linux-kernel, Sagi Grimberg, Keith Busch,
	James Smart, Hannes Reinecke

On Thu, Dec 21, 2023 at 10:17:30AM +0100, Daniel Wagner wrote:
> On Tue, Dec 19, 2023 at 06:16:48AM +0100, Christoph Hellwig wrote:
> > On Mon, Dec 18, 2023 at 04:30:56PM +0100, Daniel Wagner wrote:
> > > The live time of the queues are strictly bound to the lifetime of an
> > 
> > > +	struct nvmet_fc_tgt_queue	*_queues[NVMET_NR_QUEUES + 1];
> > >  	struct nvmet_fc_tgt_queue __rcu	*queues[NVMET_NR_QUEUES + 1];
> > 
> > For magic prefixes we use __, not _ in Linux.  But having two arrays
> > of queues right next to each other, once with rcu annotation and one
> > not rings a bit far warning bell to me.  Why do we have both?  When
> > are we supposed to use either?  Why is FC different from rest?
> 
> This is my attempt to solve the problem that after NULLing the rcu
> pointer and wait for an RCU graceperiod I still need to cleanup the
> queues. Thus I need to keep hold on the queue pointers a bit longer.
> Indeed not so elegant.
> 
> I'm sure there is a better way to do it, I just didn't figure it out
> when I wrote this part. Any tips are highly welcomed how to solve this
> puzzle.

Looking at this code again, I don't think we need use RCU for the queues
pointers at all. The association is already under RCU and thus the
queues pointers don't need additional RCUing. We either can lookup the
association or not and the queue pointer's lifetime is under kref rules.
So with this in mind the lookup would be:

	rcu_read_lock();
	list_for_each_entry_rcu(assoc, &tgtport->assoc_list, a_list) {
		if (association_id == assoc->association_id) {
			queue = assoc->queues[qid];
			if (queue &&
			    (!atomic_read(&queue->connected) ||
			     !nvmet_fc_tgt_q_get(queue)))
				queue = NULL;
			rcu_read_unlock();
			return queue;
		}
	}
	rcu_read_unlock();
	return NULL;

No need for more complexity :)

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2024-01-26 15:32 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-18 15:30 [PATCH v3 00/17] enable nvmet-fc for blktests Daniel Wagner
2023-12-18 15:30 ` [PATCH v3 01/16] nvmet: report ioccsz and iorcsz for disc ctrl Daniel Wagner
2023-12-19  4:27   ` Christoph Hellwig
2023-12-20  0:50     ` Max Gurtovoy
2023-12-19  7:24   ` Hannes Reinecke
2023-12-19 15:15     ` Daniel Wagner
2023-12-18 15:30 ` [PATCH v3 02/16] nvmet-fc: remove unnecessary bracket Daniel Wagner
2023-12-19  4:27   ` Christoph Hellwig
2023-12-19  7:25   ` Hannes Reinecke
2023-12-18 15:30 ` [PATCH v3 03/16] nvmet-trace: avoid dereferencing pointer too early Daniel Wagner
2023-12-19  4:29   ` Christoph Hellwig
2023-12-18 15:30 ` [PATCH v3 04/16] nvmet-trace: null terminate device name string correctly Daniel Wagner
2023-12-19  4:29   ` Christoph Hellwig
2023-12-18 15:30 ` [PATCH v3 05/16] nvmet-fcloop: Remove remote port from list when unlinking Daniel Wagner
2023-12-19  4:31   ` Christoph Hellwig
2023-12-19  7:26   ` Hannes Reinecke
2023-12-18 15:30 ` [PATCH v3 06/16] nvme-fc: Do not wait in vain when unloading module Daniel Wagner
2023-12-19  4:35   ` Christoph Hellwig
2024-01-05  5:05     ` Keith Busch
2023-12-19  7:28   ` Hannes Reinecke
2023-12-18 15:30 ` [PATCH v3 07/16] nvmet-fc: Release reference on target port Daniel Wagner
2023-12-19  4:36   ` Christoph Hellwig
2023-12-19  7:28   ` Hannes Reinecke
2023-12-18 15:30 ` [PATCH v3 08/16] nvmet-fc: untangle cross refcounting objects Daniel Wagner
2023-12-19  5:16   ` Christoph Hellwig
2023-12-21  9:15     ` Daniel Wagner
2024-01-26 15:32       ` Daniel Wagner
2023-12-19  7:59   ` Hannes Reinecke
2023-12-18 15:30 ` [PATCH v3 09/16] nvmet-fc: free queue and assoc directly Daniel Wagner
2023-12-19  7:59   ` Hannes Reinecke
2023-12-18 15:30 ` [PATCH v3 10/16] nvmet-fc: hold reference on hostport match Daniel Wagner
2023-12-19  8:00   ` Hannes Reinecke
2023-12-18 15:30 ` [PATCH v3 11/16] nvmet-fc: remove null hostport pointer check Daniel Wagner
2023-12-19 11:37   ` Hannes Reinecke
2023-12-18 15:31 ` [PATCH v3 12/16] nvmet-fc: do not tack refs on tgtports from assoc Daniel Wagner
2023-12-19 11:38   ` Hannes Reinecke
2023-12-18 15:31 ` [PATCH v3 13/16] nvmet-fc: abort command if when there is binding Daniel Wagner
2023-12-19 11:39   ` Hannes Reinecke
2023-12-18 15:31 ` [PATCH v3 14/16] nvmet-fc: free hostport after release reference to tgtport Daniel Wagner
2023-12-19 11:41   ` Hannes Reinecke
2023-12-18 15:31 ` [PATCH v3 15/16] nvmet-fc: avoid deadlock on delete association path Daniel Wagner
2023-12-19  6:00   ` kernel test robot
2023-12-19  8:29   ` kernel test robot
2023-12-19 11:17   ` kernel test robot
2023-12-19 11:43   ` Hannes Reinecke
2023-12-18 15:31 ` [PATCH v3 16/16] nvmet-fc: take ref count on tgtport before delete assoc Daniel Wagner
2023-12-19 11:47   ` Hannes Reinecke
2023-12-18 15:31 ` [PATCH v3 17/17] nvmet-fc: add extensive debug logging Daniel Wagner
2023-12-19 11:54   ` Hannes Reinecke
2023-12-19 14:06     ` Daniel Wagner
2023-12-18 16:10 ` [PATCH v3 00/17] enable nvmet-fc for blktests Maurizio Lombardi
2024-01-02 21:36 ` Keith Busch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).