All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/16] nvme: rework dhchap authentication host code
@ 2022-11-09  3:44 Sagi Grimberg
  2022-11-09  3:44 ` [PATCH 01/16] nvme-auth: rename __nvme_auth_[reset|free] to nvme_auth[reset|free]_dhchap Sagi Grimberg
                   ` (18 more replies)
  0 siblings, 19 replies; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

Currently the authentication code is fairly fragile with respect to
mutual locking between secrets sysfs override, re-authentication, and
controller resets.

This patch set attempts to resolve these issues by:
1. freeing queue chap context as soon as authentication completes
2. allocates a simple vector for queue chap contexts so there is
no list/tree traversal to resolve queue chap context. queue chap
contexts are 1x1 mapped to queues, which are stored in a vector as
well.
3. flush chap auth_work from the ctrl dhchap work, this simplifies
how we flush inflight authentication sequence
4. use ctrl dhchap_auth_mutex to protect only the resources that are
accessed and modified via sysfs and the authentication flow (i.e. ctrl
host_key and ctrl_key)
5. move drivers (rdma/tcp) nvme_auth_stop later in the error recovery
flow to expedite failover and not block on I/O.


Feedback is welcome.

Sagi Grimberg (16):
  nvme-auth: rename __nvme_auth_[reset|free] to
    nvme_auth[reset|free]_dhchap
  nvme-auth: remove symbol export from nvme_auth_reset
  nvme-auth: don't re-authenticate if the controller is not LIVE
  nvme-auth: remove redundant buffer deallocations
  nvme-auth: don't ignore key generation failures when initializing ctrl
    keys
  nvme-auth: don't override ctrl keys before validation
  nvme-auth: remove redundant if statement
  nvme-auth: don't keep long lived 4k dhchap buffer
  nvme-auth: clear sensitive info right after authentication completes
  nvme-auth: remove redundant deallocations
  nvme-auth: no need to reset chap contexts on re-authentication
  nvme-auth: convert dhchap_auth_list to an array
  nvme-auth: remove redundant auth_work flush
  nvme-auth: have dhchap_auth_work wait for queues auth to complete
  nvme-tcp: stop auth work after tearing down queues in error recovery
  nvme-rdma: stop auth work after tearing down queues in error recovery

 drivers/nvme/host/auth.c | 209 ++++++++++++++++++---------------------
 drivers/nvme/host/core.c |  26 +++--
 drivers/nvme/host/nvme.h |   5 +-
 drivers/nvme/host/rdma.c |   2 +-
 drivers/nvme/host/tcp.c  |   2 +-
 5 files changed, 121 insertions(+), 123 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 01/16] nvme-auth: rename __nvme_auth_[reset|free] to nvme_auth[reset|free]_dhchap
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09 17:35   ` Hannes Reinecke
  2022-11-09  3:44 ` [PATCH 02/16] nvme-auth: remove symbol export from nvme_auth_reset Sagi Grimberg
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

nvme_auth_[reset|free] operate on the controller while
__nvme_auth_[reset|free] operate on a chap struct (which maps to a queue
context). Rename it for clarity.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index c8a6db7c4498..d45333268fcf 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -654,7 +654,7 @@ static int nvme_auth_dhchap_exponential(struct nvme_ctrl *ctrl,
 	return 0;
 }
 
-static void __nvme_auth_reset(struct nvme_dhchap_queue_context *chap)
+static void nvme_auth_reset_dhchap(struct nvme_dhchap_queue_context *chap)
 {
 	kfree_sensitive(chap->host_response);
 	chap->host_response = NULL;
@@ -676,9 +676,9 @@ static void __nvme_auth_reset(struct nvme_dhchap_queue_context *chap)
 	memset(chap->c2, 0, sizeof(chap->c2));
 }
 
-static void __nvme_auth_free(struct nvme_dhchap_queue_context *chap)
+static void nvme_auth_free_dhchap(struct nvme_dhchap_queue_context *chap)
 {
-	__nvme_auth_reset(chap);
+	nvme_auth_reset_dhchap(chap);
 	if (chap->shash_tfm)
 		crypto_free_shash(chap->shash_tfm);
 	if (chap->dh_tfm)
@@ -868,7 +868,7 @@ int nvme_auth_negotiate(struct nvme_ctrl *ctrl, int qid)
 			dev_dbg(ctrl->device, "qid %d: re-using context\n", qid);
 			mutex_unlock(&ctrl->dhchap_auth_mutex);
 			flush_work(&chap->auth_work);
-			__nvme_auth_reset(chap);
+			nvme_auth_reset_dhchap(chap);
 			queue_work(nvme_wq, &chap->auth_work);
 			return 0;
 		}
@@ -928,7 +928,7 @@ void nvme_auth_reset(struct nvme_ctrl *ctrl)
 	list_for_each_entry(chap, &ctrl->dhchap_auth_list, entry) {
 		mutex_unlock(&ctrl->dhchap_auth_mutex);
 		flush_work(&chap->auth_work);
-		__nvme_auth_reset(chap);
+		nvme_auth_reset_dhchap(chap);
 	}
 	mutex_unlock(&ctrl->dhchap_auth_mutex);
 }
@@ -1002,7 +1002,7 @@ void nvme_auth_free(struct nvme_ctrl *ctrl)
 	list_for_each_entry_safe(chap, tmp, &ctrl->dhchap_auth_list, entry) {
 		list_del_init(&chap->entry);
 		flush_work(&chap->auth_work);
-		__nvme_auth_free(chap);
+		nvme_auth_free_dhchap(chap);
 	}
 	mutex_unlock(&ctrl->dhchap_auth_mutex);
 	if (ctrl->host_key) {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 02/16] nvme-auth: remove symbol export from nvme_auth_reset
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
  2022-11-09  3:44 ` [PATCH 01/16] nvme-auth: rename __nvme_auth_[reset|free] to nvme_auth[reset|free]_dhchap Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09  7:30   ` Hannes Reinecke
  2022-11-09  3:44 ` [PATCH 03/16] nvme-auth: don't re-authenticate if the controller is not LIVE Sagi Grimberg
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

Only the nvme module calls it.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index d45333268fcf..734928282d3e 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -932,7 +932,6 @@ void nvme_auth_reset(struct nvme_ctrl *ctrl)
 	}
 	mutex_unlock(&ctrl->dhchap_auth_mutex);
 }
-EXPORT_SYMBOL_GPL(nvme_auth_reset);
 
 static void nvme_dhchap_auth_work(struct work_struct *work)
 {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 03/16] nvme-auth: don't re-authenticate if the controller is not LIVE
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
  2022-11-09  3:44 ` [PATCH 01/16] nvme-auth: rename __nvme_auth_[reset|free] to nvme_auth[reset|free]_dhchap Sagi Grimberg
  2022-11-09  3:44 ` [PATCH 02/16] nvme-auth: remove symbol export from nvme_auth_reset Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09  7:33   ` Hannes Reinecke
  2022-11-09  3:44 ` [PATCH 04/16] nvme-auth: remove redundant buffer deallocations Sagi Grimberg
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

The connect sequence will re-authenticate.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 734928282d3e..93c0fc71bc7c 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -939,6 +939,13 @@ static void nvme_dhchap_auth_work(struct work_struct *work)
 		container_of(work, struct nvme_ctrl, dhchap_auth_work);
 	int ret, q;
 
+	/*
+	 * If the ctrl is no connected, bail as reconnect will handle
+	 * authentication.
+	 */
+	if (ctrl->state != NVME_CTRL_LIVE)
+		return;
+
 	/* Authenticate admin queue first */
 	ret = nvme_auth_negotiate(ctrl, 0);
 	if (ret) {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 04/16] nvme-auth: remove redundant buffer deallocations
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
                   ` (2 preceding siblings ...)
  2022-11-09  3:44 ` [PATCH 03/16] nvme-auth: don't re-authenticate if the controller is not LIVE Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09 17:36   ` Hannes Reinecke
  2022-11-09  3:44 ` [PATCH 05/16] nvme-auth: don't ignore key generation failures when initializing ctrl keys Sagi Grimberg
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

host_response, host_key, ctrl_key and sess_key are
freed in nvme_auth_reset_dhchap which is called from
nvme_auth_free_dhchap.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 93c0fc71bc7c..15cddc2bb14d 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -683,10 +683,6 @@ static void nvme_auth_free_dhchap(struct nvme_dhchap_queue_context *chap)
 		crypto_free_shash(chap->shash_tfm);
 	if (chap->dh_tfm)
 		crypto_free_kpp(chap->dh_tfm);
-	kfree_sensitive(chap->ctrl_key);
-	kfree_sensitive(chap->host_key);
-	kfree_sensitive(chap->sess_key);
-	kfree_sensitive(chap->host_response);
 	kfree(chap->buf);
 	kfree(chap);
 }
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 05/16] nvme-auth: don't ignore key generation failures when initializing ctrl keys
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
                   ` (3 preceding siblings ...)
  2022-11-09  3:44 ` [PATCH 04/16] nvme-auth: remove redundant buffer deallocations Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09  7:33   ` Hannes Reinecke
  2022-11-09  3:44 ` [PATCH 06/16] nvme-auth: don't override ctrl keys before validation Sagi Grimberg
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

nvme_auth_generate_key can fail, don't ignore it upon initialization.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 19 +++++++++++++++----
 drivers/nvme/host/core.c |  6 +++++-
 drivers/nvme/host/nvme.h |  2 +-
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 15cddc2bb14d..a6312a8846f8 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -972,15 +972,26 @@ static void nvme_dhchap_auth_work(struct work_struct *work)
 	 */
 }
 
-void nvme_auth_init_ctrl(struct nvme_ctrl *ctrl)
+int nvme_auth_init_ctrl(struct nvme_ctrl *ctrl)
 {
+	int ret;
+
 	INIT_LIST_HEAD(&ctrl->dhchap_auth_list);
 	INIT_WORK(&ctrl->dhchap_auth_work, nvme_dhchap_auth_work);
 	mutex_init(&ctrl->dhchap_auth_mutex);
 	if (!ctrl->opts)
-		return;
-	nvme_auth_generate_key(ctrl->opts->dhchap_secret, &ctrl->host_key);
-	nvme_auth_generate_key(ctrl->opts->dhchap_ctrl_secret, &ctrl->ctrl_key);
+		return 0;
+	ret = nvme_auth_generate_key(ctrl->opts->dhchap_secret,
+			&ctrl->host_key);
+	if (ret)
+		return ret;
+	ret = nvme_auth_generate_key(ctrl->opts->dhchap_ctrl_secret,
+			&ctrl->ctrl_key);
+	if (ret) {
+		nvme_auth_free_key(ctrl->host_key);
+		ctrl->host_key = NULL;
+	}
+	return ret;
 }
 EXPORT_SYMBOL_GPL(nvme_auth_init_ctrl);
 
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index dc4220600585..09f2b44fe771 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -5078,9 +5078,13 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
 
 	nvme_fault_inject_init(&ctrl->fault_inject, dev_name(ctrl->device));
 	nvme_mpath_init_ctrl(ctrl);
-	nvme_auth_init_ctrl(ctrl);
+	ret = nvme_auth_init_ctrl(ctrl);
+	if (ret)
+		goto out_free_cdev;
 
 	return 0;
+out_free_cdev:
+	cdev_device_del(&ctrl->cdev, ctrl->device);
 out_free_name:
 	nvme_put_ctrl(ctrl);
 	kfree_const(ctrl->device->kobj.name);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index a29877217ee6..cc314f45ddee 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -1019,7 +1019,7 @@ static inline bool nvme_ctrl_sgl_supported(struct nvme_ctrl *ctrl)
 }
 
 #ifdef CONFIG_NVME_AUTH
-void nvme_auth_init_ctrl(struct nvme_ctrl *ctrl);
+int nvme_auth_init_ctrl(struct nvme_ctrl *ctrl);
 void nvme_auth_stop(struct nvme_ctrl *ctrl);
 int nvme_auth_negotiate(struct nvme_ctrl *ctrl, int qid);
 int nvme_auth_wait(struct nvme_ctrl *ctrl, int qid);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 06/16] nvme-auth: don't override ctrl keys before validation
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
                   ` (4 preceding siblings ...)
  2022-11-09  3:44 ` [PATCH 05/16] nvme-auth: don't ignore key generation failures when initializing ctrl keys Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09  7:34   ` Hannes Reinecke
  2022-11-09  3:44 ` [PATCH 07/16] nvme-auth: remove redundant if statement Sagi Grimberg
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

Replace ctrl ctrl_key/host_key only after nvme_auth_generate_key is successful.
Also, this fixes a bug where the keys are leaked.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/core.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 09f2b44fe771..4aa855a4839a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3746,13 +3746,17 @@ static ssize_t nvme_ctrl_dhchap_secret_store(struct device *dev,
 	memcpy(dhchap_secret, buf, count);
 	nvme_auth_stop(ctrl);
 	if (strcmp(dhchap_secret, opts->dhchap_secret)) {
+		struct nvme_dhchap_key *key, *host_key;
 		int ret;
 
-		ret = nvme_auth_generate_key(dhchap_secret, &ctrl->host_key);
+		ret = nvme_auth_generate_key(dhchap_secret, &key);
 		if (ret)
 			return ret;
 		kfree(opts->dhchap_secret);
 		opts->dhchap_secret = dhchap_secret;
+		host_key = ctrl->host_key;
+		ctrl->host_key = key;
+		nvme_auth_free_key(host_key);
 		/* Key has changed; re-authentication with new key */
 		nvme_auth_reset(ctrl);
 	}
@@ -3796,13 +3800,17 @@ static ssize_t nvme_ctrl_dhchap_ctrl_secret_store(struct device *dev,
 	memcpy(dhchap_secret, buf, count);
 	nvme_auth_stop(ctrl);
 	if (strcmp(dhchap_secret, opts->dhchap_ctrl_secret)) {
+		struct nvme_dhchap_key *key, *ctrl_key;
 		int ret;
 
-		ret = nvme_auth_generate_key(dhchap_secret, &ctrl->ctrl_key);
+		ret = nvme_auth_generate_key(dhchap_secret, &key);
 		if (ret)
 			return ret;
 		kfree(opts->dhchap_ctrl_secret);
 		opts->dhchap_ctrl_secret = dhchap_secret;
+		ctrl_key = ctrl->ctrl_key;
+		ctrl->ctrl_key = key;
+		nvme_auth_free_key(ctrl_key);
 		/* Key has changed; re-authentication with new key */
 		nvme_auth_reset(ctrl);
 	}
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 07/16] nvme-auth: remove redundant if statement
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
                   ` (5 preceding siblings ...)
  2022-11-09  3:44 ` [PATCH 06/16] nvme-auth: don't override ctrl keys before validation Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09  6:37   ` Christoph Hellwig
  2022-11-09  7:34   ` Hannes Reinecke
  2022-11-09  3:44 ` [PATCH 08/16] nvme-auth: don't keep long lived 4k dhchap buffer Sagi Grimberg
                   ` (11 subsequent siblings)
  18 siblings, 2 replies; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

No one passes NVME_QID_ANY to nvme_auth_negotiate.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index a6312a8846f8..07d497f49e4b 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -874,7 +874,7 @@ int nvme_auth_negotiate(struct nvme_ctrl *ctrl, int qid)
 		mutex_unlock(&ctrl->dhchap_auth_mutex);
 		return -ENOMEM;
 	}
-	chap->qid = (qid == NVME_QID_ANY) ? 0 : qid;
+	chap->qid = qid;
 	chap->ctrl = ctrl;
 
 	/*
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 08/16] nvme-auth: don't keep long lived 4k dhchap buffer
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
                   ` (6 preceding siblings ...)
  2022-11-09  3:44 ` [PATCH 07/16] nvme-auth: remove redundant if statement Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09  6:39   ` Christoph Hellwig
  2022-11-09 17:37   ` Hannes Reinecke
  2022-11-09  3:44 ` [PATCH 09/16] nvme-auth: clear sensitive info right after authentication completes Sagi Grimberg
                   ` (10 subsequent siblings)
  18 siblings, 2 replies; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

dhchap structure is per-queue, it is wasteful to keep it for the entire
lifetime of the queue. Allocate it dynamically and get rid of it after
authentication. We don't need kzalloc because all accessors are clearing
it before writing to it.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 07d497f49e4b..201f25267685 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -674,6 +674,8 @@ static void nvme_auth_reset_dhchap(struct nvme_dhchap_queue_context *chap)
 	chap->transaction = 0;
 	memset(chap->c1, 0, sizeof(chap->c1));
 	memset(chap->c2, 0, sizeof(chap->c2));
+	kfree(chap->buf);
+	chap->buf = NULL;
 }
 
 static void nvme_auth_free_dhchap(struct nvme_dhchap_queue_context *chap)
@@ -683,7 +685,6 @@ static void nvme_auth_free_dhchap(struct nvme_dhchap_queue_context *chap)
 		crypto_free_shash(chap->shash_tfm);
 	if (chap->dh_tfm)
 		crypto_free_kpp(chap->dh_tfm);
-	kfree(chap->buf);
 	kfree(chap);
 }
 
@@ -695,6 +696,17 @@ static void __nvme_auth_work(struct work_struct *work)
 	size_t tl;
 	int ret = 0;
 
+	/*
+	 * Allocate a large enough buffer for the entire negotiation:
+	 * 4k should be enough to ffdhe8192.
+	 */
+	chap->buf_size = 4096;
+	chap->buf = kmalloc(chap->buf_size, GFP_KERNEL);
+	if (!chap->buf) {
+		chap->error = -ENOMEM;
+		return;
+	}
+
 	chap->transaction = ctrl->transaction++;
 
 	/* DH-HMAC-CHAP Step 1: send negotiate */
@@ -876,19 +888,6 @@ int nvme_auth_negotiate(struct nvme_ctrl *ctrl, int qid)
 	}
 	chap->qid = qid;
 	chap->ctrl = ctrl;
-
-	/*
-	 * Allocate a large enough buffer for the entire negotiation:
-	 * 4k should be enough to ffdhe8192.
-	 */
-	chap->buf_size = 4096;
-	chap->buf = kzalloc(chap->buf_size, GFP_KERNEL);
-	if (!chap->buf) {
-		mutex_unlock(&ctrl->dhchap_auth_mutex);
-		kfree(chap);
-		return -ENOMEM;
-	}
-
 	INIT_WORK(&chap->auth_work, __nvme_auth_work);
 	list_add(&chap->entry, &ctrl->dhchap_auth_list);
 	mutex_unlock(&ctrl->dhchap_auth_mutex);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 09/16] nvme-auth: clear sensitive info right after authentication completes
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
                   ` (7 preceding siblings ...)
  2022-11-09  3:44 ` [PATCH 08/16] nvme-auth: don't keep long lived 4k dhchap buffer Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09  7:35   ` Hannes Reinecke
  2022-11-09  3:44 ` [PATCH 10/16] nvme-auth: remove redundant deallocations Sagi Grimberg
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

We don't want to keep authentication sensitive info in memory for unlimited
amount of time.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 201f25267685..484315efa0b2 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -908,6 +908,8 @@ int nvme_auth_wait(struct nvme_ctrl *ctrl, int qid)
 		mutex_unlock(&ctrl->dhchap_auth_mutex);
 		flush_work(&chap->auth_work);
 		ret = chap->error;
+		/* clear sensitive info */
+		nvme_auth_reset_dhchap(chap);
 		return ret;
 	}
 	mutex_unlock(&ctrl->dhchap_auth_mutex);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 10/16] nvme-auth: remove redundant deallocations
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
                   ` (8 preceding siblings ...)
  2022-11-09  3:44 ` [PATCH 09/16] nvme-auth: clear sensitive info right after authentication completes Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09  3:44 ` [PATCH 11/16] nvme-auth: no need to reset chap contexts on re-authentication Sagi Grimberg
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

These are now redundant as the dhchap context is
removed after authentication completes.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 20 --------------------
 1 file changed, 20 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 484315efa0b2..9218f50d8be3 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -197,12 +197,6 @@ static int nvme_auth_process_dhchap_challenge(struct nvme_ctrl *ctrl,
 		return NVME_SC_AUTH_REQUIRED;
 	}
 
-	/* Reset host response if the hash had been changed */
-	if (chap->hash_id != data->hashid) {
-		kfree(chap->host_response);
-		chap->host_response = NULL;
-	}
-
 	chap->hash_id = data->hashid;
 	chap->hash_len = data->hl;
 	dev_dbg(ctrl->device, "qid %d: selected hash %s\n",
@@ -219,14 +213,6 @@ static int nvme_auth_process_dhchap_challenge(struct nvme_ctrl *ctrl,
 		return NVME_SC_AUTH_REQUIRED;
 	}
 
-	/* Clear host and controller key to avoid accidental reuse */
-	kfree_sensitive(chap->host_key);
-	chap->host_key = NULL;
-	chap->host_key_len = 0;
-	kfree_sensitive(chap->ctrl_key);
-	chap->ctrl_key = NULL;
-	chap->ctrl_key_len = 0;
-
 	if (chap->dhgroup_id == data->dhgid &&
 	    (data->dhgid == NVME_AUTH_DHGROUP_NULL || chap->dh_tfm)) {
 		dev_dbg(ctrl->device,
@@ -621,9 +607,6 @@ static int nvme_auth_dhchap_exponential(struct nvme_ctrl *ctrl,
 	if (ret) {
 		dev_dbg(ctrl->device,
 			"failed to generate public key, error %d\n", ret);
-		kfree(chap->host_key);
-		chap->host_key = NULL;
-		chap->host_key_len = 0;
 		chap->status = NVME_AUTH_DHCHAP_FAILURE_INCORRECT_PAYLOAD;
 		return ret;
 	}
@@ -643,9 +626,6 @@ static int nvme_auth_dhchap_exponential(struct nvme_ctrl *ctrl,
 	if (ret) {
 		dev_dbg(ctrl->device,
 			"failed to generate shared secret, error %d\n", ret);
-		kfree_sensitive(chap->sess_key);
-		chap->sess_key = NULL;
-		chap->sess_key_len = 0;
 		chap->status = NVME_AUTH_DHCHAP_FAILURE_INCORRECT_PAYLOAD;
 		return ret;
 	}
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 11/16] nvme-auth: no need to reset chap contexts on re-authentication
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
                   ` (9 preceding siblings ...)
  2022-11-09  3:44 ` [PATCH 10/16] nvme-auth: remove redundant deallocations Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09  7:36   ` Hannes Reinecke
  2022-11-09  3:44 ` [PATCH 12/16] nvme-auth: convert dhchap_auth_list to an array Sagi Grimberg
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

Now that the chap context is reset upon completion, this is no longer
needed. Also remove nvme_auth_reset as no callers are left.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 13 -------------
 drivers/nvme/host/core.c |  4 ----
 drivers/nvme/host/nvme.h |  1 -
 3 files changed, 18 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 9218f50d8be3..23d486284da9 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -897,19 +897,6 @@ int nvme_auth_wait(struct nvme_ctrl *ctrl, int qid)
 }
 EXPORT_SYMBOL_GPL(nvme_auth_wait);
 
-void nvme_auth_reset(struct nvme_ctrl *ctrl)
-{
-	struct nvme_dhchap_queue_context *chap;
-
-	mutex_lock(&ctrl->dhchap_auth_mutex);
-	list_for_each_entry(chap, &ctrl->dhchap_auth_list, entry) {
-		mutex_unlock(&ctrl->dhchap_auth_mutex);
-		flush_work(&chap->auth_work);
-		nvme_auth_reset_dhchap(chap);
-	}
-	mutex_unlock(&ctrl->dhchap_auth_mutex);
-}
-
 static void nvme_dhchap_auth_work(struct work_struct *work)
 {
 	struct nvme_ctrl *ctrl =
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 4aa855a4839a..c15dbbe509f5 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3757,8 +3757,6 @@ static ssize_t nvme_ctrl_dhchap_secret_store(struct device *dev,
 		host_key = ctrl->host_key;
 		ctrl->host_key = key;
 		nvme_auth_free_key(host_key);
-		/* Key has changed; re-authentication with new key */
-		nvme_auth_reset(ctrl);
 	}
 	/* Start re-authentication */
 	dev_info(ctrl->device, "re-authenticating controller\n");
@@ -3811,8 +3809,6 @@ static ssize_t nvme_ctrl_dhchap_ctrl_secret_store(struct device *dev,
 		ctrl_key = ctrl->ctrl_key;
 		ctrl->ctrl_key = key;
 		nvme_auth_free_key(ctrl_key);
-		/* Key has changed; re-authentication with new key */
-		nvme_auth_reset(ctrl);
 	}
 	/* Start re-authentication */
 	dev_info(ctrl->device, "re-authenticating controller\n");
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index cc314f45ddee..f9708825aa24 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -1023,7 +1023,6 @@ int nvme_auth_init_ctrl(struct nvme_ctrl *ctrl);
 void nvme_auth_stop(struct nvme_ctrl *ctrl);
 int nvme_auth_negotiate(struct nvme_ctrl *ctrl, int qid);
 int nvme_auth_wait(struct nvme_ctrl *ctrl, int qid);
-void nvme_auth_reset(struct nvme_ctrl *ctrl);
 void nvme_auth_free(struct nvme_ctrl *ctrl);
 #else
 static inline void nvme_auth_init_ctrl(struct nvme_ctrl *ctrl) {};
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 12/16] nvme-auth: convert dhchap_auth_list to an array
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
                   ` (10 preceding siblings ...)
  2022-11-09  3:44 ` [PATCH 11/16] nvme-auth: no need to reset chap contexts on re-authentication Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09  6:48   ` Christoph Hellwig
  2022-11-09 17:40   ` Hannes Reinecke
  2022-11-09  3:44 ` [PATCH 13/16] nvme-auth: remove redundant auth_work flush Sagi Grimberg
                   ` (6 subsequent siblings)
  18 siblings, 2 replies; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

We know exactly how many dhchap contexts we will need, there is no need
to hold a list that we need to protect with a mutex. Convert to
a dynamically allocated array. And dhchap_context access state is
maintained by the chap itself.

Make dhchap_auth_mutex protect only the ctrl host_key and ctrl_key
in a fine-graned lock such that there is no long lasting acuisition
of the lock.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 113 +++++++++++++++++++++------------------
 drivers/nvme/host/core.c |   4 ++
 drivers/nvme/host/nvme.h |   2 +-
 3 files changed, 65 insertions(+), 54 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 23d486284da9..b78e0df54f6c 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -46,6 +46,14 @@ struct nvme_dhchap_queue_context {
 	(qid == 0) ? 0 : BLK_MQ_REQ_NOWAIT | BLK_MQ_REQ_RESERVED
 #define nvme_auth_queue_from_qid(ctrl, qid) \
 	(qid == 0) ? (ctrl)->fabrics_q : (ctrl)->connect_q
+#define nvme_auth_chap_from_qid(ctrl, qid) \
+	&((struct nvme_dhchap_queue_context*)(ctrl)->dhchap_ctxs)[qid]
+#define ctrl_max_dhchaps(ctrl) \
+	(ctrl)->opts->nr_io_queues + (ctrl)->opts->nr_write_queues + \
+			(ctrl)->opts->nr_poll_queues + 1
+#define nvme_foreach_dhchap(i, chap, ctrl) \
+	for (i = 0, chap = (ctrl)->dhchap_ctxs; \
+	     i < ctrl_max_dhchaps(ctrl) && chap != NULL; i++, chap++)
 
 static int nvme_auth_submit(struct nvme_ctrl *ctrl, int qid,
 			    void *data, size_t data_len, bool auth_send)
@@ -330,7 +338,7 @@ static int nvme_auth_process_dhchap_success1(struct nvme_ctrl *ctrl,
 	struct nvmf_auth_dhchap_success1_data *data = chap->buf;
 	size_t size = sizeof(*data);
 
-	if (ctrl->ctrl_key)
+	if (chap->ctrl_key)
 		size += chap->hash_len;
 
 	if (chap->buf_size < size) {
@@ -418,8 +426,10 @@ static int nvme_auth_dhchap_setup_host_response(struct nvme_ctrl *ctrl,
 		__func__, chap->qid, chap->s1, chap->transaction);
 
 	if (!chap->host_response) {
+		mutex_lock(&ctrl->dhchap_auth_mutex);
 		chap->host_response = nvme_auth_transform_key(ctrl->host_key,
 						ctrl->opts->host->nqn);
+		mutex_unlock(&ctrl->dhchap_auth_mutex);
 		if (IS_ERR(chap->host_response)) {
 			ret = PTR_ERR(chap->host_response);
 			chap->host_response = NULL;
@@ -430,8 +440,10 @@ static int nvme_auth_dhchap_setup_host_response(struct nvme_ctrl *ctrl,
 			__func__, chap->qid);
 	}
 
+	mutex_lock(&ctrl->dhchap_auth_mutex);
 	ret = crypto_shash_setkey(chap->shash_tfm,
 			chap->host_response, ctrl->host_key->len);
+	mutex_unlock(&ctrl->dhchap_auth_mutex);
 	if (ret) {
 		dev_warn(ctrl->device, "qid %d: failed to set key, error %d\n",
 			 chap->qid, ret);
@@ -507,6 +519,7 @@ static int nvme_auth_dhchap_setup_ctrl_response(struct nvme_ctrl *ctrl,
 		ret = PTR_ERR(ctrl_response);
 		return ret;
 	}
+
 	ret = crypto_shash_setkey(chap->shash_tfm,
 			ctrl_response, ctrl->ctrl_key->len);
 	if (ret) {
@@ -665,7 +678,6 @@ static void nvme_auth_free_dhchap(struct nvme_dhchap_queue_context *chap)
 		crypto_free_shash(chap->shash_tfm);
 	if (chap->dh_tfm)
 		crypto_free_kpp(chap->dh_tfm);
-	kfree(chap);
 }
 
 static void __nvme_auth_work(struct work_struct *work)
@@ -789,6 +801,7 @@ static void __nvme_auth_work(struct work_struct *work)
 		return;
 	}
 
+	mutex_lock(&ctrl->dhchap_auth_mutex);
 	if (ctrl->ctrl_key) {
 		dev_dbg(ctrl->device,
 			"%s: qid %d controller response\n",
@@ -816,6 +829,7 @@ static void __nvme_auth_work(struct work_struct *work)
 		if (ret)
 			chap->error = ret;
 	}
+	mutex_unlock(&ctrl->dhchap_auth_mutex);
 	if (!ret) {
 		chap->error = 0;
 		return;
@@ -848,29 +862,8 @@ int nvme_auth_negotiate(struct nvme_ctrl *ctrl, int qid)
 		return -ENOKEY;
 	}
 
-	mutex_lock(&ctrl->dhchap_auth_mutex);
-	/* Check if the context is already queued */
-	list_for_each_entry(chap, &ctrl->dhchap_auth_list, entry) {
-		WARN_ON(!chap->buf);
-		if (chap->qid == qid) {
-			dev_dbg(ctrl->device, "qid %d: re-using context\n", qid);
-			mutex_unlock(&ctrl->dhchap_auth_mutex);
-			flush_work(&chap->auth_work);
-			nvme_auth_reset_dhchap(chap);
-			queue_work(nvme_wq, &chap->auth_work);
-			return 0;
-		}
-	}
-	chap = kzalloc(sizeof(*chap), GFP_KERNEL);
-	if (!chap) {
-		mutex_unlock(&ctrl->dhchap_auth_mutex);
-		return -ENOMEM;
-	}
-	chap->qid = qid;
-	chap->ctrl = ctrl;
-	INIT_WORK(&chap->auth_work, __nvme_auth_work);
-	list_add(&chap->entry, &ctrl->dhchap_auth_list);
-	mutex_unlock(&ctrl->dhchap_auth_mutex);
+	chap = nvme_auth_chap_from_qid(ctrl, qid);
+	cancel_work_sync(&chap->auth_work);
 	queue_work(nvme_wq, &chap->auth_work);
 	return 0;
 }
@@ -881,19 +874,12 @@ int nvme_auth_wait(struct nvme_ctrl *ctrl, int qid)
 	struct nvme_dhchap_queue_context *chap;
 	int ret;
 
-	mutex_lock(&ctrl->dhchap_auth_mutex);
-	list_for_each_entry(chap, &ctrl->dhchap_auth_list, entry) {
-		if (chap->qid != qid)
-			continue;
-		mutex_unlock(&ctrl->dhchap_auth_mutex);
-		flush_work(&chap->auth_work);
-		ret = chap->error;
-		/* clear sensitive info */
-		nvme_auth_reset_dhchap(chap);
-		return ret;
-	}
-	mutex_unlock(&ctrl->dhchap_auth_mutex);
-	return -ENXIO;
+	chap = nvme_auth_chap_from_qid(ctrl, qid);
+	flush_work(&chap->auth_work);
+	ret = chap->error;
+	/* clear sensitive info */
+	nvme_auth_reset_dhchap(chap);
+	return ret;
 }
 EXPORT_SYMBOL_GPL(nvme_auth_wait);
 
@@ -942,11 +928,11 @@ static void nvme_dhchap_auth_work(struct work_struct *work)
 
 int nvme_auth_init_ctrl(struct nvme_ctrl *ctrl)
 {
-	int ret;
+	struct nvme_dhchap_queue_context *chap;
+	int i, ret;
 
-	INIT_LIST_HEAD(&ctrl->dhchap_auth_list);
-	INIT_WORK(&ctrl->dhchap_auth_work, nvme_dhchap_auth_work);
 	mutex_init(&ctrl->dhchap_auth_mutex);
+	INIT_WORK(&ctrl->dhchap_auth_work, nvme_dhchap_auth_work);
 	if (!ctrl->opts)
 		return 0;
 	ret = nvme_auth_generate_key(ctrl->opts->dhchap_secret,
@@ -955,37 +941,57 @@ int nvme_auth_init_ctrl(struct nvme_ctrl *ctrl)
 		return ret;
 	ret = nvme_auth_generate_key(ctrl->opts->dhchap_ctrl_secret,
 			&ctrl->ctrl_key);
-	if (ret) {
-		nvme_auth_free_key(ctrl->host_key);
-		ctrl->host_key = NULL;
+	if (ret)
+		goto err_free_dhchap_secret;
+
+	if (!ctrl->opts->dhchap_secret && !ctrl->opts->dhchap_ctrl_secret)
+		return ret;
+
+	ctrl->dhchap_ctxs = kvcalloc(ctrl_max_dhchaps(ctrl),
+				sizeof(*chap), GFP_KERNEL);
+	if (!ctrl->dhchap_ctxs) {
+		ret = -ENOMEM;
+		goto err_free_dhchap_ctrl_secret;
 	}
+
+	nvme_foreach_dhchap(i, chap, ctrl) {
+		chap->qid = i;
+		chap->ctrl = ctrl;
+		INIT_WORK(&chap->auth_work, __nvme_auth_work);
+	}
+
+out:
 	return ret;
+err_free_dhchap_ctrl_secret:
+	nvme_auth_free_key(ctrl->ctrl_key);
+	ctrl->ctrl_key = NULL;
+err_free_dhchap_secret:
+	nvme_auth_free_key(ctrl->host_key);
+	ctrl->host_key = NULL;
+	goto out;
 }
 EXPORT_SYMBOL_GPL(nvme_auth_init_ctrl);
 
 void nvme_auth_stop(struct nvme_ctrl *ctrl)
 {
-	struct nvme_dhchap_queue_context *chap = NULL, *tmp;
+	struct nvme_dhchap_queue_context *chap;
+	int i;
 
 	cancel_work_sync(&ctrl->dhchap_auth_work);
-	mutex_lock(&ctrl->dhchap_auth_mutex);
-	list_for_each_entry_safe(chap, tmp, &ctrl->dhchap_auth_list, entry)
+	nvme_foreach_dhchap(i, chap, ctrl)
 		cancel_work_sync(&chap->auth_work);
-	mutex_unlock(&ctrl->dhchap_auth_mutex);
 }
 EXPORT_SYMBOL_GPL(nvme_auth_stop);
 
 void nvme_auth_free(struct nvme_ctrl *ctrl)
 {
-	struct nvme_dhchap_queue_context *chap = NULL, *tmp;
+	struct nvme_dhchap_queue_context *chap;
+	int i;
 
-	mutex_lock(&ctrl->dhchap_auth_mutex);
-	list_for_each_entry_safe(chap, tmp, &ctrl->dhchap_auth_list, entry) {
-		list_del_init(&chap->entry);
+	nvme_foreach_dhchap(i, chap, ctrl) {
 		flush_work(&chap->auth_work);
 		nvme_auth_free_dhchap(chap);
 	}
-	mutex_unlock(&ctrl->dhchap_auth_mutex);
 	if (ctrl->host_key) {
 		nvme_auth_free_key(ctrl->host_key);
 		ctrl->host_key = NULL;
@@ -994,5 +1000,6 @@ void nvme_auth_free(struct nvme_ctrl *ctrl)
 		nvme_auth_free_key(ctrl->ctrl_key);
 		ctrl->ctrl_key = NULL;
 	}
+	kfree(ctrl->dhchap_ctxs);
 }
 EXPORT_SYMBOL_GPL(nvme_auth_free);
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index c15dbbe509f5..d8e6ff4257e9 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3755,7 +3755,9 @@ static ssize_t nvme_ctrl_dhchap_secret_store(struct device *dev,
 		kfree(opts->dhchap_secret);
 		opts->dhchap_secret = dhchap_secret;
 		host_key = ctrl->host_key;
+		mutex_lock(&ctrl->dhchap_auth_mutex);
 		ctrl->host_key = key;
+		mutex_unlock(&ctrl->dhchap_auth_mutex);
 		nvme_auth_free_key(host_key);
 	}
 	/* Start re-authentication */
@@ -3807,7 +3809,9 @@ static ssize_t nvme_ctrl_dhchap_ctrl_secret_store(struct device *dev,
 		kfree(opts->dhchap_ctrl_secret);
 		opts->dhchap_ctrl_secret = dhchap_secret;
 		ctrl_key = ctrl->ctrl_key;
+		mutex_lock(&ctrl->dhchap_auth_mutex);
 		ctrl->ctrl_key = key;
+		mutex_unlock(&ctrl->dhchap_auth_mutex);
 		nvme_auth_free_key(ctrl_key);
 	}
 	/* Start re-authentication */
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index f9708825aa24..7d12759857d7 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -336,8 +336,8 @@ struct nvme_ctrl {
 
 #ifdef CONFIG_NVME_AUTH
 	struct work_struct dhchap_auth_work;
-	struct list_head dhchap_auth_list;
 	struct mutex dhchap_auth_mutex;
+	void *dhchap_ctxs;
 	struct nvme_dhchap_key *host_key;
 	struct nvme_dhchap_key *ctrl_key;
 	u16 transaction;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 13/16] nvme-auth: remove redundant auth_work flush
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
                   ` (11 preceding siblings ...)
  2022-11-09  3:44 ` [PATCH 12/16] nvme-auth: convert dhchap_auth_list to an array Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09 17:41   ` Hannes Reinecke
  2022-11-09  3:44 ` [PATCH 14/16] nvme-auth: have dhchap_auth_work wait for queues auth to complete Sagi Grimberg
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

only ctrl deletion calls nvme_auth_free, which was stopped prior in the
teardown stage, so there is no possibility that it should ever run when
nvme_auth_free is called.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index b78e0df54f6c..0159179c2455 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -988,10 +988,8 @@ void nvme_auth_free(struct nvme_ctrl *ctrl)
 	struct nvme_dhchap_queue_context *chap;
 	int i;
 
-	nvme_foreach_dhchap(i, chap, ctrl) {
-		flush_work(&chap->auth_work);
+	nvme_foreach_dhchap(i, chap, ctrl)
 		nvme_auth_free_dhchap(chap);
-	}
 	if (ctrl->host_key) {
 		nvme_auth_free_key(ctrl->host_key);
 		ctrl->host_key = NULL;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 14/16] nvme-auth: have dhchap_auth_work wait for queues auth to complete
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
                   ` (12 preceding siblings ...)
  2022-11-09  3:44 ` [PATCH 13/16] nvme-auth: remove redundant auth_work flush Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09  7:44   ` Hannes Reinecke
  2022-11-09  3:44 ` [PATCH 15/16] nvme-tcp: stop auth work after tearing down queues in error recovery Sagi Grimberg
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

It triggered the queue authentication work elements in parallel, but
the ctrl authentication work itself completes when all of them
completes. Hence wait for queues auth completions.

This also makes nvme_auth_stop simply a sync cancel of ctrl
dhchap_auth_work.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 0159179c2455..e88d6254f2f5 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -924,6 +924,12 @@ static void nvme_dhchap_auth_work(struct work_struct *work)
 	 * Failure is a soft-state; credentials remain valid until
 	 * the controller terminates the connection.
 	 */
+	for (q = 1; q < ctrl->queue_count; q++) {
+		ret = nvme_auth_wait(ctrl, q);
+		if (ret)
+			dev_warn(ctrl->device,
+				 "qid %d: authentication failed\n", q);
+	}
 }
 
 int nvme_auth_init_ctrl(struct nvme_ctrl *ctrl)
@@ -974,12 +980,7 @@ EXPORT_SYMBOL_GPL(nvme_auth_init_ctrl);
 
 void nvme_auth_stop(struct nvme_ctrl *ctrl)
 {
-	struct nvme_dhchap_queue_context *chap;
-	int i;
-
 	cancel_work_sync(&ctrl->dhchap_auth_work);
-	nvme_foreach_dhchap(i, chap, ctrl)
-		cancel_work_sync(&chap->auth_work);
 }
 EXPORT_SYMBOL_GPL(nvme_auth_stop);
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 15/16] nvme-tcp: stop auth work after tearing down queues in error recovery
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
                   ` (13 preceding siblings ...)
  2022-11-09  3:44 ` [PATCH 14/16] nvme-auth: have dhchap_auth_work wait for queues auth to complete Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09  3:44 ` [PATCH 16/16] nvme-rdma: " Sagi Grimberg
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

when starting error recovery there might be a authentication work
running, and it involves I/O commands. Given the controller is tearing
down there is no chance for the I/O to complete other than timing out
which may unnecessarily take a full io timeout.

So first tear down the queues, fail/cancel all inflight I/O (including
potentially authentication) and only then stop authentication. This
ensures that failover is not stalled due to blocked authentication I/O.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/tcp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 339dc9e94f5c..d78fdfac635e 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2128,7 +2128,6 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work)
 				struct nvme_tcp_ctrl, err_work);
 	struct nvme_ctrl *ctrl = &tcp_ctrl->ctrl;
 
-	nvme_auth_stop(ctrl);
 	nvme_stop_keep_alive(ctrl);
 	flush_work(&ctrl->async_event_work);
 	nvme_tcp_teardown_io_queues(ctrl, false);
@@ -2136,6 +2135,7 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work)
 	nvme_start_queues(ctrl);
 	nvme_tcp_teardown_admin_queue(ctrl, false);
 	nvme_start_admin_queue(ctrl);
+	nvme_auth_stop(ctrl);
 
 	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING)) {
 		/* state change failure is ok if we started ctrl delete */
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 16/16] nvme-rdma: stop auth work after tearing down queues in error recovery
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
                   ` (14 preceding siblings ...)
  2022-11-09  3:44 ` [PATCH 15/16] nvme-tcp: stop auth work after tearing down queues in error recovery Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09  3:44 ` [PATCH 17/16 blktests] nvme: add re-authentication running concurrently with reset Sagi Grimberg
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

when starting error recovery there might be a authentication work
running, and it involves I/O commands. Given the controller is tearing
down there is no chance for the I/O to complete other than timing out
which may unnecessarily take a full io timeout.

So first tear down the queues, fail/cancel all inflight I/O (including
potentially authentication) and only then stop authentication. This
ensures that failover is not stalled due to blocked authentication I/O.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/rdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 6e079abb22ee..ff1a6924f339 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1153,13 +1153,13 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
 	struct nvme_rdma_ctrl *ctrl = container_of(work,
 			struct nvme_rdma_ctrl, err_work);
 
-	nvme_auth_stop(&ctrl->ctrl);
 	nvme_stop_keep_alive(&ctrl->ctrl);
 	flush_work(&ctrl->ctrl.async_event_work);
 	nvme_rdma_teardown_io_queues(ctrl, false);
 	nvme_start_queues(&ctrl->ctrl);
 	nvme_rdma_teardown_admin_queue(ctrl, false);
 	nvme_start_admin_queue(&ctrl->ctrl);
+	nvme_auth_stop(&ctrl->ctrl);
 
 	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
 		/* state change failure is ok if we started ctrl delete */
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 17/16 blktests] nvme: add re-authentication running concurrently with reset
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
                   ` (15 preceding siblings ...)
  2022-11-09  3:44 ` [PATCH 16/16] nvme-rdma: " Sagi Grimberg
@ 2022-11-09  3:44 ` Sagi Grimberg
  2022-11-09  3:47 ` [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
  2022-11-09  7:45 ` Hannes Reinecke
  18 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:44 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni

when re-authentication with new host-key and/or ctrl-key there is an
async work that is scheduled async. A controller reset should be mutual
exclusive with the re-authentication work. Add a test that triggers
re-authentication and immediately resets the controller.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 tests/nvme/046     | 112 +++++++++++++++++++++++++++++++++++++++++++++
 tests/nvme/046.out |  23 ++++++++++
 2 files changed, 135 insertions(+)
 create mode 100755 tests/nvme/046
 create mode 100644 tests/nvme/046.out

diff --git a/tests/nvme/046 b/tests/nvme/046
new file mode 100755
index 000000000000..2cd1b1bff4f1
--- /dev/null
+++ b/tests/nvme/046
@@ -0,0 +1,112 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-3.0+
+# Copyright (C) 2022 Hannes Reinecke, SUSE Labs
+#
+# Test re-authentication and reset running in parallel
+
+. tests/nvme/rc
+
+DESCRIPTION="Test re-authentication concurrently with reset"
+QUICK=1
+
+requires() {
+	_nvme_requires
+	_have_loop
+	_have_kernel_option NVME_AUTH
+	_have_kernel_option NVME_TARGET_AUTH
+	_require_nvme_trtype_is_fabrics
+	_require_nvme_cli_auth
+	_have_driver dh_generic
+}
+
+
+test() {
+	local port
+	local subsys_name="blktests-subsystem-1"
+	local hostid
+	local hostnqn
+	local file_path="${TMPDIR}/img"
+	local hostkey
+	local new_hostkey
+	local ctrlkey
+	local new_ctrlkey
+	local ctrldev
+
+	echo "Running ${TEST_NAME}"
+
+	hostid="$(uuidgen)"
+	if [ -z "$hostid" ] ; then
+		echo "uuidgen failed"
+		return 1
+	fi
+	hostnqn="nqn.2014-08.org.nvmexpress:uuid:${hostid}"
+
+	hostkey="$(nvme gen-dhchap-key -n ${subsys_name} 2> /dev/null)"
+	if [ -z "$hostkey" ] ; then
+		echo "failed to generate host key"
+		return 1
+	fi
+
+	ctrlkey="$(nvme gen-dhchap-key -n ${subsys_name} 2> /dev/null)"
+	if [ -z "$ctrlkey" ] ; then
+		echo "failed to generate ctrl key"
+		return 1
+	fi
+
+	_setup_nvmet
+
+	truncate -s 512M "${file_path}"
+
+	_create_nvmet_subsystem "${subsys_name}" "${file_path}"
+	port="$(_create_nvmet_port "${nvme_trtype}")"
+	_add_nvmet_subsys_to_port "${port}" "${subsys_name}"
+	_create_nvmet_host "${subsys_name}" "${hostnqn}" "${hostkey}" "${ctrlkey}"
+
+	_set_nvmet_dhgroup "${hostnqn}" "ffdhe2048"
+
+	_nvme_connect_subsys "${nvme_trtype}" "${subsys_name}" \
+			     "${def_traddr}" "${def_trsvcid}" \
+			     "${hostnqn}" "${hostid}" \
+			     "${hostkey}" "${ctrlkey}"
+
+	udevadm settle
+
+	ctrldev=$(_find_nvme_dev "${subsys_name}")
+	if [ -z "$ctrldev" ] ; then
+		echo "nvme controller not found"
+	fi
+
+	hostkey_file="/sys/class/nvme/${ctrldev}/dhchap_secret"
+	ctrlkey_file="/sys/class/nvme/${ctrldev}/dhchap_ctrl_secret"
+
+	for ((i = 0; i < 10; i++)); do
+		echo "Re-authenticate with new host key (iteration $i)"
+		new_hostkey="$(nvme gen-dhchap-key -n ${subsys_name} 2> /dev/null)"
+		_set_nvmet_hostkey "${hostnqn}" "${new_hostkey}"
+		echo "${new_hostkey}" > "${hostkey_file}"
+		_nvme_reset_ctrl $ctrldev
+
+		echo "Re-authenticate with new ctrl key (iteration $i)"
+		new_ctrlkey="$(nvme gen-dhchap-key -n ${subsys_name} 2> /dev/null)"
+		_set_nvmet_ctrlkey "${hostnqn}" "${new_ctrlkey}"
+		echo "${new_ctrlkey}" > "${ctrlkey_file}"
+		_nvme_reset_ctrl $ctrldev
+	done
+
+	nvmedev=$(_find_nvme_dev "${subsys_name}")
+
+	_run_fio_rand_io --size=8m --filename="/dev/${nvmedev}n1"
+
+	_nvme_disconnect_subsys "${subsys_name}"
+
+	_remove_nvmet_subsystem_from_port "${port}" "${subsys_name}"
+	_remove_nvmet_subsystem "${subsys_name}"
+
+	_remove_nvmet_port "${port}"
+
+	_remove_nvmet_host "${hostnqn}"
+
+	rm "${file_path}"
+
+	echo "Test complete"
+}
diff --git a/tests/nvme/046.out b/tests/nvme/046.out
new file mode 100644
index 000000000000..2d63f4e3112b
--- /dev/null
+++ b/tests/nvme/046.out
@@ -0,0 +1,23 @@
+Running nvme/046
+Re-authenticate with new host key (iteration 0)
+Re-authenticate with new ctrl key (iteration 0)
+Re-authenticate with new host key (iteration 1)
+Re-authenticate with new ctrl key (iteration 1)
+Re-authenticate with new host key (iteration 2)
+Re-authenticate with new ctrl key (iteration 2)
+Re-authenticate with new host key (iteration 3)
+Re-authenticate with new ctrl key (iteration 3)
+Re-authenticate with new host key (iteration 4)
+Re-authenticate with new ctrl key (iteration 4)
+Re-authenticate with new host key (iteration 5)
+Re-authenticate with new ctrl key (iteration 5)
+Re-authenticate with new host key (iteration 6)
+Re-authenticate with new ctrl key (iteration 6)
+Re-authenticate with new host key (iteration 7)
+Re-authenticate with new ctrl key (iteration 7)
+Re-authenticate with new host key (iteration 8)
+Re-authenticate with new ctrl key (iteration 8)
+Re-authenticate with new host key (iteration 9)
+Re-authenticate with new ctrl key (iteration 9)
+NQN:blktests-subsystem-1 disconnected 1 controller(s)
+Test complete
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 00/16] nvme: rework dhchap authentication host code
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
                   ` (16 preceding siblings ...)
  2022-11-09  3:44 ` [PATCH 17/16 blktests] nvme: add re-authentication running concurrently with reset Sagi Grimberg
@ 2022-11-09  3:47 ` Sagi Grimberg
  2022-11-09  7:45 ` Hannes Reinecke
  18 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09  3:47 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni


> Currently the authentication code is fairly fragile with respect to
> mutual locking between secrets sysfs override, re-authentication, and
> controller resets.


Title is misleading... not reworking the authentication code as a whole.
just the arrangements of how the queue chap contexts are handled...

A more appropriate title would be: few cleanups, fixes and enhancements
to the host authentication code.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 07/16] nvme-auth: remove redundant if statement
  2022-11-09  3:44 ` [PATCH 07/16] nvme-auth: remove redundant if statement Sagi Grimberg
@ 2022-11-09  6:37   ` Christoph Hellwig
  2022-11-09 18:42     ` Sagi Grimberg
  2022-11-09  7:34   ` Hannes Reinecke
  1 sibling, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2022-11-09  6:37 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Hannes Reinecke, Keith Busch,
	Chaitanya Kulkarni

On Wed, Nov 09, 2022 at 05:44:10AM +0200, Sagi Grimberg wrote:
> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
> index a6312a8846f8..07d497f49e4b 100644
> --- a/drivers/nvme/host/auth.c
> +++ b/drivers/nvme/host/auth.c
> @@ -874,7 +874,7 @@ int nvme_auth_negotiate(struct nvme_ctrl *ctrl, int qid)
>  		mutex_unlock(&ctrl->dhchap_auth_mutex);
>  		return -ENOMEM;
>  	}
> -	chap->qid = (qid == NVME_QID_ANY) ? 0 : qid;
> +	chap->qid = qid;

Btw, the same "transformation" can also be removed in nvme_auth_submit.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 08/16] nvme-auth: don't keep long lived 4k dhchap buffer
  2022-11-09  3:44 ` [PATCH 08/16] nvme-auth: don't keep long lived 4k dhchap buffer Sagi Grimberg
@ 2022-11-09  6:39   ` Christoph Hellwig
  2022-11-09 18:05     ` Sagi Grimberg
  2022-11-09 17:37   ` Hannes Reinecke
  1 sibling, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2022-11-09  6:39 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Hannes Reinecke, Keith Busch,
	Chaitanya Kulkarni

On Wed, Nov 09, 2022 at 05:44:11AM +0200, Sagi Grimberg wrote:
> dhchap structure is per-queue, it is wasteful to keep it for the entire
> lifetime of the queue. Allocate it dynamically and get rid of it after
> authentication. We don't need kzalloc because all accessors are clearing
> it before writing to it.

Btw, do we need a mempool here to make sure we always get one for
a reconnect under memory pressure?

> +	/*
> +	 * Allocate a large enough buffer for the entire negotiation:
> +	 * 4k should be enough to ffdhe8192.
> +	 */
> +	chap->buf_size = 4096;

Same comments as for Hannes previous series:  no need for ->buf_size,
this can be a global constant.

And in the comment the should is weird, it is enough.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 12/16] nvme-auth: convert dhchap_auth_list to an array
  2022-11-09  3:44 ` [PATCH 12/16] nvme-auth: convert dhchap_auth_list to an array Sagi Grimberg
@ 2022-11-09  6:48   ` Christoph Hellwig
  2022-11-09 19:06     ` Sagi Grimberg
  2022-11-09 17:40   ` Hannes Reinecke
  1 sibling, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2022-11-09  6:48 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Hannes Reinecke, Keith Busch,
	Chaitanya Kulkarni

> Make dhchap_auth_mutex protect only the ctrl host_key and ctrl_key
> in a fine-graned lock such that there is no long lasting acuisition
> of the lock.

Can you split that part into a separate patch?  Right now the patch
is a bit confusing because it does multiple things.


> +#define nvme_auth_chap_from_qid(ctrl, qid) \
> +	&((struct nvme_dhchap_queue_context*)(ctrl)->dhchap_ctxs)[qid]

Please properly type dhchap_ctxs instead of the ugly cast here.  That
also removed the need for the macro.
functions instead of macros.

> +#define ctrl_max_dhchaps(ctrl) \
> +	(ctrl)->opts->nr_io_queues + (ctrl)->opts->nr_write_queues + \
> +			(ctrl)->opts->nr_poll_queues + 1

This overallocates in case the controller supports less queues than
requested, but I guess that's not a big problem in practice.

Also please make this an inline function to add type safety and to make
it more readable.

> +#define nvme_foreach_dhchap(i, chap, ctrl) \
> +	for (i = 0, chap = (ctrl)->dhchap_ctxs; \
> +	     i < ctrl_max_dhchaps(ctrl) && chap != NULL; i++, chap++)

I find this macro a bit obsfucating.  There's only three callers anyway,
and they only use chap 1, 2 or three times respectively.

So I'd just loop:

	for (i = 0, i < ctrl_max_dhchaps(ctrl); i++)

and then do the dereference of >dhchap_ctxs in the loop.

>  			    void *data, size_t data_len, bool auth_send)
> @@ -330,7 +338,7 @@ static int nvme_auth_process_dhchap_success1(struct nvme_ctrl *ctrl,
>  	struct nvmf_auth_dhchap_success1_data *data = chap->buf;
>  	size_t size = sizeof(*data);
>  
> -	if (ctrl->ctrl_key)
> +	if (chap->ctrl_key)

How is this related to the rest of the patch?

> +		INIT_WORK(&chap->auth_work, __nvme_auth_work);

Btw, this really should lose the __ prefix in another prep patch.

> +	}
> +
> +out:
>  	return ret;

We can just drop the out label and return directly.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/16] nvme-auth: remove symbol export from nvme_auth_reset
  2022-11-09  3:44 ` [PATCH 02/16] nvme-auth: remove symbol export from nvme_auth_reset Sagi Grimberg
@ 2022-11-09  7:30   ` Hannes Reinecke
  0 siblings, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2022-11-09  7:30 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni

On 11/9/22 04:44, Sagi Grimberg wrote:
> Only the nvme module calls it.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/auth.c | 1 -
>   1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
> index d45333268fcf..734928282d3e 100644
> --- a/drivers/nvme/host/auth.c
> +++ b/drivers/nvme/host/auth.c
> @@ -932,7 +932,6 @@ void nvme_auth_reset(struct nvme_ctrl *ctrl)
>   	}
>   	mutex_unlock(&ctrl->dhchap_auth_mutex);
>   }
> -EXPORT_SYMBOL_GPL(nvme_auth_reset);
>   
>   static void nvme_dhchap_auth_work(struct work_struct *work)
>   {

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 03/16] nvme-auth: don't re-authenticate if the controller is not LIVE
  2022-11-09  3:44 ` [PATCH 03/16] nvme-auth: don't re-authenticate if the controller is not LIVE Sagi Grimberg
@ 2022-11-09  7:33   ` Hannes Reinecke
  2022-11-09 19:00     ` Sagi Grimberg
  0 siblings, 1 reply; 43+ messages in thread
From: Hannes Reinecke @ 2022-11-09  7:33 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni

On 11/9/22 04:44, Sagi Grimberg wrote:
> The connect sequence will re-authenticate.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/auth.c | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
> index 734928282d3e..93c0fc71bc7c 100644
> --- a/drivers/nvme/host/auth.c
> +++ b/drivers/nvme/host/auth.c
> @@ -939,6 +939,13 @@ static void nvme_dhchap_auth_work(struct work_struct *work)
>   		container_of(work, struct nvme_ctrl, dhchap_auth_work);
>   	int ret, q;
>   
> +	/*
> +	 * If the ctrl is no connected, bail as reconnect will handle
> +	 * authentication.
> +	 */
> +	if (ctrl->state != NVME_CTRL_LIVE)
> +		return;
> +
>   	/* Authenticate admin queue first */
>   	ret = nvme_auth_negotiate(ctrl, 0);
>   	if (ret) {

What about a state check in __nvme_auth_work()? That should only be 
called if the queue is not in CONNECTING or LIVE.

Otherwise:
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 05/16] nvme-auth: don't ignore key generation failures when initializing ctrl keys
  2022-11-09  3:44 ` [PATCH 05/16] nvme-auth: don't ignore key generation failures when initializing ctrl keys Sagi Grimberg
@ 2022-11-09  7:33   ` Hannes Reinecke
  0 siblings, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2022-11-09  7:33 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni

On 11/9/22 04:44, Sagi Grimberg wrote:
> nvme_auth_generate_key can fail, don't ignore it upon initialization.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/auth.c | 19 +++++++++++++++----
>   drivers/nvme/host/core.c |  6 +++++-
>   drivers/nvme/host/nvme.h |  2 +-
>   3 files changed, 21 insertions(+), 6 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 06/16] nvme-auth: don't override ctrl keys before validation
  2022-11-09  3:44 ` [PATCH 06/16] nvme-auth: don't override ctrl keys before validation Sagi Grimberg
@ 2022-11-09  7:34   ` Hannes Reinecke
  0 siblings, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2022-11-09  7:34 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni

On 11/9/22 04:44, Sagi Grimberg wrote:
> Replace ctrl ctrl_key/host_key only after nvme_auth_generate_key is successful.
> Also, this fixes a bug where the keys are leaked.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/core.c | 12 ++++++++++--
>   1 file changed, 10 insertions(+), 2 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 07/16] nvme-auth: remove redundant if statement
  2022-11-09  3:44 ` [PATCH 07/16] nvme-auth: remove redundant if statement Sagi Grimberg
  2022-11-09  6:37   ` Christoph Hellwig
@ 2022-11-09  7:34   ` Hannes Reinecke
  1 sibling, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2022-11-09  7:34 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni

On 11/9/22 04:44, Sagi Grimberg wrote:
> No one passes NVME_QID_ANY to nvme_auth_negotiate.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/auth.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
> index a6312a8846f8..07d497f49e4b 100644
> --- a/drivers/nvme/host/auth.c
> +++ b/drivers/nvme/host/auth.c
> @@ -874,7 +874,7 @@ int nvme_auth_negotiate(struct nvme_ctrl *ctrl, int qid)
>   		mutex_unlock(&ctrl->dhchap_auth_mutex);
>   		return -ENOMEM;
>   	}
> -	chap->qid = (qid == NVME_QID_ANY) ? 0 : qid;
> +	chap->qid = qid;
>   	chap->ctrl = ctrl;
>   
>   	/*
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 09/16] nvme-auth: clear sensitive info right after authentication completes
  2022-11-09  3:44 ` [PATCH 09/16] nvme-auth: clear sensitive info right after authentication completes Sagi Grimberg
@ 2022-11-09  7:35   ` Hannes Reinecke
  0 siblings, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2022-11-09  7:35 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni

On 11/9/22 04:44, Sagi Grimberg wrote:
> We don't want to keep authentication sensitive info in memory for unlimited
> amount of time.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/auth.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
> index 201f25267685..484315efa0b2 100644
> --- a/drivers/nvme/host/auth.c
> +++ b/drivers/nvme/host/auth.c
> @@ -908,6 +908,8 @@ int nvme_auth_wait(struct nvme_ctrl *ctrl, int qid)
>   		mutex_unlock(&ctrl->dhchap_auth_mutex);
>   		flush_work(&chap->auth_work);
>   		ret = chap->error;
> +		/* clear sensitive info */
> +		nvme_auth_reset_dhchap(chap);
>   		return ret;
>   	}
>   	mutex_unlock(&ctrl->dhchap_auth_mutex);

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/16] nvme-auth: no need to reset chap contexts on re-authentication
  2022-11-09  3:44 ` [PATCH 11/16] nvme-auth: no need to reset chap contexts on re-authentication Sagi Grimberg
@ 2022-11-09  7:36   ` Hannes Reinecke
  0 siblings, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2022-11-09  7:36 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni

On 11/9/22 04:44, Sagi Grimberg wrote:
> Now that the chap context is reset upon completion, this is no longer
> needed. Also remove nvme_auth_reset as no callers are left.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/auth.c | 13 -------------
>   drivers/nvme/host/core.c |  4 ----
>   drivers/nvme/host/nvme.h |  1 -
>   3 files changed, 18 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 14/16] nvme-auth: have dhchap_auth_work wait for queues auth to complete
  2022-11-09  3:44 ` [PATCH 14/16] nvme-auth: have dhchap_auth_work wait for queues auth to complete Sagi Grimberg
@ 2022-11-09  7:44   ` Hannes Reinecke
  2022-11-09 19:01     ` Sagi Grimberg
  0 siblings, 1 reply; 43+ messages in thread
From: Hannes Reinecke @ 2022-11-09  7:44 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni

On 11/9/22 04:44, Sagi Grimberg wrote:
> It triggered the queue authentication work elements in parallel, but
> the ctrl authentication work itself completes when all of them
> completes. Hence wait for queues auth completions.
> 
> This also makes nvme_auth_stop simply a sync cancel of ctrl
> dhchap_auth_work.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/auth.c | 11 ++++++-----
>   1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
> index 0159179c2455..e88d6254f2f5 100644
> --- a/drivers/nvme/host/auth.c
> +++ b/drivers/nvme/host/auth.c
> @@ -924,6 +924,12 @@ static void nvme_dhchap_auth_work(struct work_struct *work)
>   	 * Failure is a soft-state; credentials remain valid until
>   	 * the controller terminates the connection.
>   	 */
> +	for (q = 1; q < ctrl->queue_count; q++) {
> +		ret = nvme_auth_wait(ctrl, q);
> +		if (ret)
> +			dev_warn(ctrl->device,
> +				 "qid %d: authentication failed\n", q);
> +	}
>   }
>   
Don't we need to clear the dhchap context here?

>   int nvme_auth_init_ctrl(struct nvme_ctrl *ctrl)
> @@ -974,12 +980,7 @@ EXPORT_SYMBOL_GPL(nvme_auth_init_ctrl);
>   
>   void nvme_auth_stop(struct nvme_ctrl *ctrl)
>   {
> -	struct nvme_dhchap_queue_context *chap;
> -	int i;
> -
>   	cancel_work_sync(&ctrl->dhchap_auth_work);
> -	nvme_foreach_dhchap(i, chap, ctrl)
> -		cancel_work_sync(&chap->auth_work);
>   }
>   EXPORT_SYMBOL_GPL(nvme_auth_stop);
>   
Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 00/16] nvme: rework dhchap authentication host code
  2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
                   ` (17 preceding siblings ...)
  2022-11-09  3:47 ` [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
@ 2022-11-09  7:45 ` Hannes Reinecke
  2022-11-09 17:42   ` Sagi Grimberg
  18 siblings, 1 reply; 43+ messages in thread
From: Hannes Reinecke @ 2022-11-09  7:45 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni

On 11/9/22 04:44, Sagi Grimberg wrote:
> Currently the authentication code is fairly fragile with respect to
> mutual locking between secrets sysfs override, re-authentication, and
> controller resets.
> 
> This patch set attempts to resolve these issues by:
> 1. freeing queue chap context as soon as authentication completes
> 2. allocates a simple vector for queue chap contexts so there is
> no list/tree traversal to resolve queue chap context. queue chap
> contexts are 1x1 mapped to queues, which are stored in a vector as
> well.
> 3. flush chap auth_work from the ctrl dhchap work, this simplifies
> how we flush inflight authentication sequence
> 4. use ctrl dhchap_auth_mutex to protect only the resources that are
> accessed and modified via sysfs and the authentication flow (i.e. ctrl
> host_key and ctrl_key)
> 5. move drivers (rdma/tcp) nvme_auth_stop later in the error recovery
> flow to expedite failover and not block on I/O.
> 
> 
> Feedback is welcome.
> 
> Sagi Grimberg (16):
>    nvme-auth: rename __nvme_auth_[reset|free] to
>      nvme_auth[reset|free]_dhchap
>    nvme-auth: remove symbol export from nvme_auth_reset
>    nvme-auth: don't re-authenticate if the controller is not LIVE
>    nvme-auth: remove redundant buffer deallocations
>    nvme-auth: don't ignore key generation failures when initializing ctrl
>      keys
>    nvme-auth: don't override ctrl keys before validation
>    nvme-auth: remove redundant if statement
>    nvme-auth: don't keep long lived 4k dhchap buffer
>    nvme-auth: clear sensitive info right after authentication completes
>    nvme-auth: remove redundant deallocations
>    nvme-auth: no need to reset chap contexts on re-authentication
>    nvme-auth: convert dhchap_auth_list to an array
>    nvme-auth: remove redundant auth_work flush
>    nvme-auth: have dhchap_auth_work wait for queues auth to complete
>    nvme-tcp: stop auth work after tearing down queues in error recovery
>    nvme-rdma: stop auth work after tearing down queues in error recovery
> 
>   drivers/nvme/host/auth.c | 209 ++++++++++++++++++---------------------
>   drivers/nvme/host/core.c |  26 +++--
>   drivers/nvme/host/nvme.h |   5 +-
>   drivers/nvme/host/rdma.c |   2 +-
>   drivers/nvme/host/tcp.c  |   2 +-
>   5 files changed, 121 insertions(+), 123 deletions(-)
> 

For some reason I'm missing parts of this patchset (patches 1, 4, 8, 10, 
and 12); have they been eaten by my mailer?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/16] nvme-auth: rename __nvme_auth_[reset|free] to nvme_auth[reset|free]_dhchap
  2022-11-09  3:44 ` [PATCH 01/16] nvme-auth: rename __nvme_auth_[reset|free] to nvme_auth[reset|free]_dhchap Sagi Grimberg
@ 2022-11-09 17:35   ` Hannes Reinecke
  0 siblings, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2022-11-09 17:35 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni

On 11/9/22 04:44, Sagi Grimberg wrote:
> nvme_auth_[reset|free] operate on the controller while
> __nvme_auth_[reset|free] operate on a chap struct (which maps to a queue
> context). Rename it for clarity.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/auth.c | 12 ++++++------
>   1 file changed, 6 insertions(+), 6 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/16] nvme-auth: remove redundant buffer deallocations
  2022-11-09  3:44 ` [PATCH 04/16] nvme-auth: remove redundant buffer deallocations Sagi Grimberg
@ 2022-11-09 17:36   ` Hannes Reinecke
  0 siblings, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2022-11-09 17:36 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni

On 11/9/22 04:44, Sagi Grimberg wrote:
> host_response, host_key, ctrl_key and sess_key are
> freed in nvme_auth_reset_dhchap which is called from
> nvme_auth_free_dhchap.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/auth.c | 4 ----
>   1 file changed, 4 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 08/16] nvme-auth: don't keep long lived 4k dhchap buffer
  2022-11-09  3:44 ` [PATCH 08/16] nvme-auth: don't keep long lived 4k dhchap buffer Sagi Grimberg
  2022-11-09  6:39   ` Christoph Hellwig
@ 2022-11-09 17:37   ` Hannes Reinecke
  1 sibling, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2022-11-09 17:37 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni

On 11/9/22 04:44, Sagi Grimberg wrote:
> dhchap structure is per-queue, it is wasteful to keep it for the entire
> lifetime of the queue. Allocate it dynamically and get rid of it after
> authentication. We don't need kzalloc because all accessors are clearing
> it before writing to it.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/auth.c | 27 +++++++++++++--------------
>   1 file changed, 13 insertions(+), 14 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 12/16] nvme-auth: convert dhchap_auth_list to an array
  2022-11-09  3:44 ` [PATCH 12/16] nvme-auth: convert dhchap_auth_list to an array Sagi Grimberg
  2022-11-09  6:48   ` Christoph Hellwig
@ 2022-11-09 17:40   ` Hannes Reinecke
  2022-11-09 19:05     ` Sagi Grimberg
  1 sibling, 1 reply; 43+ messages in thread
From: Hannes Reinecke @ 2022-11-09 17:40 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni

On 11/9/22 04:44, Sagi Grimberg wrote:
> We know exactly how many dhchap contexts we will need, there is no need
> to hold a list that we need to protect with a mutex. Convert to
> a dynamically allocated array. And dhchap_context access state is
> maintained by the chap itself.
> 
> Make dhchap_auth_mutex protect only the ctrl host_key and ctrl_key
> in a fine-graned lock such that there is no long lasting acuisition
> of the lock.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/auth.c | 113 +++++++++++++++++++++------------------
>   drivers/nvme/host/core.c |   4 ++
>   drivers/nvme/host/nvme.h |   2 +-
>   3 files changed, 65 insertions(+), 54 deletions(-)
> 
> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
> index 23d486284da9..b78e0df54f6c 100644
> --- a/drivers/nvme/host/auth.c
> +++ b/drivers/nvme/host/auth.c
> @@ -46,6 +46,14 @@ struct nvme_dhchap_queue_context {
>   	(qid == 0) ? 0 : BLK_MQ_REQ_NOWAIT | BLK_MQ_REQ_RESERVED
>   #define nvme_auth_queue_from_qid(ctrl, qid) \
>   	(qid == 0) ? (ctrl)->fabrics_q : (ctrl)->connect_q
> +#define nvme_auth_chap_from_qid(ctrl, qid) \
> +	&((struct nvme_dhchap_queue_context*)(ctrl)->dhchap_ctxs)[qid]
> +#define ctrl_max_dhchaps(ctrl) \
> +	(ctrl)->opts->nr_io_queues + (ctrl)->opts->nr_write_queues + \
> +			(ctrl)->opts->nr_poll_queues + 1
> +#define nvme_foreach_dhchap(i, chap, ctrl) \
> +	for (i = 0, chap = (ctrl)->dhchap_ctxs; \
> +	     i < ctrl_max_dhchaps(ctrl) && chap != NULL; i++, chap++)
>   
>   static int nvme_auth_submit(struct nvme_ctrl *ctrl, int qid,
>   			    void *data, size_t data_len, bool auth_send)

Please use an xarray (cf my patch doing the same).
That leads to far better readable code, and avoid having to do weird casts.

Also, the claim 'we know exactly how many queues' isn't quite true, as 
we only know the _maximum_ number of queues. The controller can (and 
will) decrease this number based on the target configuration.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 13/16] nvme-auth: remove redundant auth_work flush
  2022-11-09  3:44 ` [PATCH 13/16] nvme-auth: remove redundant auth_work flush Sagi Grimberg
@ 2022-11-09 17:41   ` Hannes Reinecke
  0 siblings, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2022-11-09 17:41 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni

On 11/9/22 04:44, Sagi Grimberg wrote:
> only ctrl deletion calls nvme_auth_free, which was stopped prior in the
> teardown stage, so there is no possibility that it should ever run when
> nvme_auth_free is called.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/auth.c | 4 +---
>   1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
> index b78e0df54f6c..0159179c2455 100644
> --- a/drivers/nvme/host/auth.c
> +++ b/drivers/nvme/host/auth.c
> @@ -988,10 +988,8 @@ void nvme_auth_free(struct nvme_ctrl *ctrl)
>   	struct nvme_dhchap_queue_context *chap;
>   	int i;
>   
> -	nvme_foreach_dhchap(i, chap, ctrl) {
> -		flush_work(&chap->auth_work);
> +	nvme_foreach_dhchap(i, chap, ctrl)
>   		nvme_auth_free_dhchap(chap);
> -	}
>   	if (ctrl->host_key) {
>   		nvme_auth_free_key(ctrl->host_key);
>   		ctrl->host_key = NULL;

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 00/16] nvme: rework dhchap authentication host code
  2022-11-09  7:45 ` Hannes Reinecke
@ 2022-11-09 17:42   ` Sagi Grimberg
  0 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09 17:42 UTC (permalink / raw)
  To: Hannes Reinecke, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni

[-- Attachment #1: Type: text/plain, Size: 2671 bytes --]

Are you still unable to see them?

set attached to this email.

On 11/9/22 09:45, Hannes Reinecke wrote:
> On 11/9/22 04:44, Sagi Grimberg wrote:
>> Currently the authentication code is fairly fragile with respect to
>> mutual locking between secrets sysfs override, re-authentication, and
>> controller resets.
>>
>> This patch set attempts to resolve these issues by:
>> 1. freeing queue chap context as soon as authentication completes
>> 2. allocates a simple vector for queue chap contexts so there is
>> no list/tree traversal to resolve queue chap context. queue chap
>> contexts are 1x1 mapped to queues, which are stored in a vector as
>> well.
>> 3. flush chap auth_work from the ctrl dhchap work, this simplifies
>> how we flush inflight authentication sequence
>> 4. use ctrl dhchap_auth_mutex to protect only the resources that are
>> accessed and modified via sysfs and the authentication flow (i.e. ctrl
>> host_key and ctrl_key)
>> 5. move drivers (rdma/tcp) nvme_auth_stop later in the error recovery
>> flow to expedite failover and not block on I/O.
>>
>>
>> Feedback is welcome.
>>
>> Sagi Grimberg (16):
>>    nvme-auth: rename __nvme_auth_[reset|free] to
>>      nvme_auth[reset|free]_dhchap
>>    nvme-auth: remove symbol export from nvme_auth_reset
>>    nvme-auth: don't re-authenticate if the controller is not LIVE
>>    nvme-auth: remove redundant buffer deallocations
>>    nvme-auth: don't ignore key generation failures when initializing ctrl
>>      keys
>>    nvme-auth: don't override ctrl keys before validation
>>    nvme-auth: remove redundant if statement
>>    nvme-auth: don't keep long lived 4k dhchap buffer
>>    nvme-auth: clear sensitive info right after authentication completes
>>    nvme-auth: remove redundant deallocations
>>    nvme-auth: no need to reset chap contexts on re-authentication
>>    nvme-auth: convert dhchap_auth_list to an array
>>    nvme-auth: remove redundant auth_work flush
>>    nvme-auth: have dhchap_auth_work wait for queues auth to complete
>>    nvme-tcp: stop auth work after tearing down queues in error recovery
>>    nvme-rdma: stop auth work after tearing down queues in error recovery
>>
>>   drivers/nvme/host/auth.c | 209 ++++++++++++++++++---------------------
>>   drivers/nvme/host/core.c |  26 +++--
>>   drivers/nvme/host/nvme.h |   5 +-
>>   drivers/nvme/host/rdma.c |   2 +-
>>   drivers/nvme/host/tcp.c  |   2 +-
>>   5 files changed, 121 insertions(+), 123 deletions(-)
>>
> 
> For some reason I'm missing parts of this patchset (patches 1, 4, 8, 10, 
> and 12); have they been eaten by my mailer?
> 
> Cheers,
> 
> Hannes

[-- Attachment #2: 0017-nvme-add-re-authentication-running-concurrently-with.patch --]
[-- Type: text/x-patch, Size: 5054 bytes --]

From 645039cf38cf960f0eaed3ad312048ef55128fc1 Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Tue, 8 Nov 2022 03:36:39 +0200
Subject: [PATCH 17/16 blktests] nvme: add re-authentication running concurrently
 with reset

when re-authentication with new host-key and/or ctrl-key there is an
async work that is scheduled async. A controller reset should be mutual
exclusive with the re-authentication work. Add a test that triggers
re-authentication and immediately resets the controller.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 tests/nvme/046     | 112 +++++++++++++++++++++++++++++++++++++++++++++
 tests/nvme/046.out |  23 ++++++++++
 2 files changed, 135 insertions(+)
 create mode 100755 tests/nvme/046
 create mode 100644 tests/nvme/046.out

diff --git a/tests/nvme/046 b/tests/nvme/046
new file mode 100755
index 000000000000..2cd1b1bff4f1
--- /dev/null
+++ b/tests/nvme/046
@@ -0,0 +1,112 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-3.0+
+# Copyright (C) 2022 Hannes Reinecke, SUSE Labs
+#
+# Test re-authentication and reset running in parallel
+
+. tests/nvme/rc
+
+DESCRIPTION="Test re-authentication concurrently with reset"
+QUICK=1
+
+requires() {
+	_nvme_requires
+	_have_loop
+	_have_kernel_option NVME_AUTH
+	_have_kernel_option NVME_TARGET_AUTH
+	_require_nvme_trtype_is_fabrics
+	_require_nvme_cli_auth
+	_have_driver dh_generic
+}
+
+
+test() {
+	local port
+	local subsys_name="blktests-subsystem-1"
+	local hostid
+	local hostnqn
+	local file_path="${TMPDIR}/img"
+	local hostkey
+	local new_hostkey
+	local ctrlkey
+	local new_ctrlkey
+	local ctrldev
+
+	echo "Running ${TEST_NAME}"
+
+	hostid="$(uuidgen)"
+	if [ -z "$hostid" ] ; then
+		echo "uuidgen failed"
+		return 1
+	fi
+	hostnqn="nqn.2014-08.org.nvmexpress:uuid:${hostid}"
+
+	hostkey="$(nvme gen-dhchap-key -n ${subsys_name} 2> /dev/null)"
+	if [ -z "$hostkey" ] ; then
+		echo "failed to generate host key"
+		return 1
+	fi
+
+	ctrlkey="$(nvme gen-dhchap-key -n ${subsys_name} 2> /dev/null)"
+	if [ -z "$ctrlkey" ] ; then
+		echo "failed to generate ctrl key"
+		return 1
+	fi
+
+	_setup_nvmet
+
+	truncate -s 512M "${file_path}"
+
+	_create_nvmet_subsystem "${subsys_name}" "${file_path}"
+	port="$(_create_nvmet_port "${nvme_trtype}")"
+	_add_nvmet_subsys_to_port "${port}" "${subsys_name}"
+	_create_nvmet_host "${subsys_name}" "${hostnqn}" "${hostkey}" "${ctrlkey}"
+
+	_set_nvmet_dhgroup "${hostnqn}" "ffdhe2048"
+
+	_nvme_connect_subsys "${nvme_trtype}" "${subsys_name}" \
+			     "${def_traddr}" "${def_trsvcid}" \
+			     "${hostnqn}" "${hostid}" \
+			     "${hostkey}" "${ctrlkey}"
+
+	udevadm settle
+
+	ctrldev=$(_find_nvme_dev "${subsys_name}")
+	if [ -z "$ctrldev" ] ; then
+		echo "nvme controller not found"
+	fi
+
+	hostkey_file="/sys/class/nvme/${ctrldev}/dhchap_secret"
+	ctrlkey_file="/sys/class/nvme/${ctrldev}/dhchap_ctrl_secret"
+
+	for ((i = 0; i < 10; i++)); do
+		echo "Re-authenticate with new host key (iteration $i)"
+		new_hostkey="$(nvme gen-dhchap-key -n ${subsys_name} 2> /dev/null)"
+		_set_nvmet_hostkey "${hostnqn}" "${new_hostkey}"
+		echo "${new_hostkey}" > "${hostkey_file}"
+		_nvme_reset_ctrl $ctrldev
+
+		echo "Re-authenticate with new ctrl key (iteration $i)"
+		new_ctrlkey="$(nvme gen-dhchap-key -n ${subsys_name} 2> /dev/null)"
+		_set_nvmet_ctrlkey "${hostnqn}" "${new_ctrlkey}"
+		echo "${new_ctrlkey}" > "${ctrlkey_file}"
+		_nvme_reset_ctrl $ctrldev
+	done
+
+	nvmedev=$(_find_nvme_dev "${subsys_name}")
+
+	_run_fio_rand_io --size=8m --filename="/dev/${nvmedev}n1"
+
+	_nvme_disconnect_subsys "${subsys_name}"
+
+	_remove_nvmet_subsystem_from_port "${port}" "${subsys_name}"
+	_remove_nvmet_subsystem "${subsys_name}"
+
+	_remove_nvmet_port "${port}"
+
+	_remove_nvmet_host "${hostnqn}"
+
+	rm "${file_path}"
+
+	echo "Test complete"
+}
diff --git a/tests/nvme/046.out b/tests/nvme/046.out
new file mode 100644
index 000000000000..2d63f4e3112b
--- /dev/null
+++ b/tests/nvme/046.out
@@ -0,0 +1,23 @@
+Running nvme/046
+Re-authenticate with new host key (iteration 0)
+Re-authenticate with new ctrl key (iteration 0)
+Re-authenticate with new host key (iteration 1)
+Re-authenticate with new ctrl key (iteration 1)
+Re-authenticate with new host key (iteration 2)
+Re-authenticate with new ctrl key (iteration 2)
+Re-authenticate with new host key (iteration 3)
+Re-authenticate with new ctrl key (iteration 3)
+Re-authenticate with new host key (iteration 4)
+Re-authenticate with new ctrl key (iteration 4)
+Re-authenticate with new host key (iteration 5)
+Re-authenticate with new ctrl key (iteration 5)
+Re-authenticate with new host key (iteration 6)
+Re-authenticate with new ctrl key (iteration 6)
+Re-authenticate with new host key (iteration 7)
+Re-authenticate with new ctrl key (iteration 7)
+Re-authenticate with new host key (iteration 8)
+Re-authenticate with new ctrl key (iteration 8)
+Re-authenticate with new host key (iteration 9)
+Re-authenticate with new ctrl key (iteration 9)
+NQN:blktests-subsystem-1 disconnected 1 controller(s)
+Test complete
-- 
2.34.1


[-- Attachment #3: 0000-cover-letter.patch --]
[-- Type: text/x-patch, Size: 2388 bytes --]

From f3c5ad2f208aaf3f5e76da5f0b67a7a8fa50cb74 Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Wed, 9 Nov 2022 05:30:33 +0200
Subject: [PATCH 00/16] nvme: rework dhchap authentication host code

Currently the authentication code is fairly fragile with respect to
mutual locking between secrets sysfs override, re-authentication, and
controller resets.

This patch set attempts to resolve these issues by:
1. freeing queue chap context as soon as authentication completes
2. allocates a simple vector for queue chap contexts so there is
no list/tree traversal to resolve queue chap context. queue chap
contexts are 1x1 mapped to queues, which are stored in a vector as
well.
3. flush chap auth_work from the ctrl dhchap work, this simplifies
how we flush inflight authentication sequence
4. use ctrl dhchap_auth_mutex to protect only the resources that are
accessed and modified via sysfs and the authentication flow (i.e. ctrl
host_key and ctrl_key)
5. move drivers (rdma/tcp) nvme_auth_stop later in the error recovery
flow to expedite failover and not block on I/O.


Feedback is welcome.

Sagi Grimberg (16):
  nvme-auth: rename __nvme_auth_[reset|free] to
    nvme_auth[reset|free]_dhchap
  nvme-auth: remove symbol export from nvme_auth_reset
  nvme-auth: don't re-authenticate if the controller is not LIVE
  nvme-auth: remove redundant buffer deallocations
  nvme-auth: don't ignore key generation failures when initializing ctrl
    keys
  nvme-auth: don't override ctrl keys before validation
  nvme-auth: remove redundant if statement
  nvme-auth: don't keep long lived 4k dhchap buffer
  nvme-auth: clear sensitive info right after authentication completes
  nvme-auth: remove redundant deallocations
  nvme-auth: no need to reset chap contexts on re-authentication
  nvme-auth: convert dhchap_auth_list to an array
  nvme-auth: remove redundant auth_work flush
  nvme-auth: have dhchap_auth_work wait for queues auth to complete
  nvme-tcp: stop auth work after tearing down queues in error recovery
  nvme-rdma: stop auth work after tearing down queues in error recovery

 drivers/nvme/host/auth.c | 209 ++++++++++++++++++---------------------
 drivers/nvme/host/core.c |  26 +++--
 drivers/nvme/host/nvme.h |   5 +-
 drivers/nvme/host/rdma.c |   2 +-
 drivers/nvme/host/tcp.c  |   2 +-
 5 files changed, 121 insertions(+), 123 deletions(-)

-- 
2.34.1


[-- Attachment #4: 0016-nvme-rdma-stop-auth-work-after-tearing-down-queues-i.patch --]
[-- Type: text/x-patch, Size: 1645 bytes --]

From f3c5ad2f208aaf3f5e76da5f0b67a7a8fa50cb74 Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Sun, 6 Nov 2022 01:36:26 +0200
Subject: [PATCH 16/16] nvme-rdma: stop auth work after tearing down queues in
 error recovery

when starting error recovery there might be a authentication work
running, and it involves I/O commands. Given the controller is tearing
down there is no chance for the I/O to complete other than timing out
which may unnecessarily take a full io timeout.

So first tear down the queues, fail/cancel all inflight I/O (including
potentially authentication) and only then stop authentication. This
ensures that failover is not stalled due to blocked authentication I/O.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/rdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 6e079abb22ee..ff1a6924f339 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1153,13 +1153,13 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
 	struct nvme_rdma_ctrl *ctrl = container_of(work,
 			struct nvme_rdma_ctrl, err_work);
 
-	nvme_auth_stop(&ctrl->ctrl);
 	nvme_stop_keep_alive(&ctrl->ctrl);
 	flush_work(&ctrl->ctrl.async_event_work);
 	nvme_rdma_teardown_io_queues(ctrl, false);
 	nvme_start_queues(&ctrl->ctrl);
 	nvme_rdma_teardown_admin_queue(ctrl, false);
 	nvme_start_admin_queue(&ctrl->ctrl);
+	nvme_auth_stop(&ctrl->ctrl);
 
 	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
 		/* state change failure is ok if we started ctrl delete */
-- 
2.34.1


[-- Attachment #5: 0015-nvme-tcp-stop-auth-work-after-tearing-down-queues-in.patch --]
[-- Type: text/x-patch, Size: 1669 bytes --]

From e5e211f7a36235884f57dd84529205b76a619035 Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Sun, 6 Nov 2022 01:32:10 +0200
Subject: [PATCH 15/16] nvme-tcp: stop auth work after tearing down queues in
 error recovery

when starting error recovery there might be a authentication work
running, and it involves I/O commands. Given the controller is tearing
down there is no chance for the I/O to complete other than timing out
which may unnecessarily take a full io timeout.

So first tear down the queues, fail/cancel all inflight I/O (including
potentially authentication) and only then stop authentication. This
ensures that failover is not stalled due to blocked authentication I/O.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/tcp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 339dc9e94f5c..d78fdfac635e 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2128,7 +2128,6 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work)
 				struct nvme_tcp_ctrl, err_work);
 	struct nvme_ctrl *ctrl = &tcp_ctrl->ctrl;
 
-	nvme_auth_stop(ctrl);
 	nvme_stop_keep_alive(ctrl);
 	flush_work(&ctrl->async_event_work);
 	nvme_tcp_teardown_io_queues(ctrl, false);
@@ -2136,6 +2135,7 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work)
 	nvme_start_queues(ctrl);
 	nvme_tcp_teardown_admin_queue(ctrl, false);
 	nvme_start_admin_queue(ctrl);
+	nvme_auth_stop(ctrl);
 
 	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING)) {
 		/* state change failure is ok if we started ctrl delete */
-- 
2.34.1


[-- Attachment #6: 0014-nvme-auth-have-dhchap_auth_work-wait-for-queues-auth.patch --]
[-- Type: text/x-patch, Size: 1566 bytes --]

From 17617d1b4d04cdbf3e4e14c8ff7555cad216725e Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Sun, 6 Nov 2022 01:28:21 +0200
Subject: [PATCH 14/16] nvme-auth: have dhchap_auth_work wait for queues auth
 to complete

It triggered the queue authentication work elements in parallel, but
the ctrl authentication work itself completes when all of them
completes. Hence wait for queues auth completions.

This also makes nvme_auth_stop simply a sync cancel of ctrl
dhchap_auth_work.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 0159179c2455..e88d6254f2f5 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -924,6 +924,12 @@ static void nvme_dhchap_auth_work(struct work_struct *work)
 	 * Failure is a soft-state; credentials remain valid until
 	 * the controller terminates the connection.
 	 */
+	for (q = 1; q < ctrl->queue_count; q++) {
+		ret = nvme_auth_wait(ctrl, q);
+		if (ret)
+			dev_warn(ctrl->device,
+				 "qid %d: authentication failed\n", q);
+	}
 }
 
 int nvme_auth_init_ctrl(struct nvme_ctrl *ctrl)
@@ -974,12 +980,7 @@ EXPORT_SYMBOL_GPL(nvme_auth_init_ctrl);
 
 void nvme_auth_stop(struct nvme_ctrl *ctrl)
 {
-	struct nvme_dhchap_queue_context *chap;
-	int i;
-
 	cancel_work_sync(&ctrl->dhchap_auth_work);
-	nvme_foreach_dhchap(i, chap, ctrl)
-		cancel_work_sync(&chap->auth_work);
 }
 EXPORT_SYMBOL_GPL(nvme_auth_stop);
 
-- 
2.34.1


[-- Attachment #7: 0013-nvme-auth-remove-redundant-auth_work-flush.patch --]
[-- Type: text/x-patch, Size: 1053 bytes --]

From 2f192152a043094702bd50ae04ffbb43c8abdbd9 Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Sun, 6 Nov 2022 01:22:38 +0200
Subject: [PATCH 13/16] nvme-auth: remove redundant auth_work flush

only ctrl deletion calls nvme_auth_free, which was stopped prior in the
teardown stage, so there is no possibility that it should ever run when
nvme_auth_free is called.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index b78e0df54f6c..0159179c2455 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -988,10 +988,8 @@ void nvme_auth_free(struct nvme_ctrl *ctrl)
 	struct nvme_dhchap_queue_context *chap;
 	int i;
 
-	nvme_foreach_dhchap(i, chap, ctrl) {
-		flush_work(&chap->auth_work);
+	nvme_foreach_dhchap(i, chap, ctrl)
 		nvme_auth_free_dhchap(chap);
-	}
 	if (ctrl->host_key) {
 		nvme_auth_free_key(ctrl->host_key);
 		ctrl->host_key = NULL;
-- 
2.34.1


[-- Attachment #8: 0012-nvme-auth-convert-dhchap_auth_list-to-an-array.patch --]
[-- Type: text/x-patch, Size: 9587 bytes --]

From 732c22abeb49159c87d42f1fcb94143e521dc9c0 Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Thu, 27 Oct 2022 15:46:11 +0300
Subject: [PATCH 12/16] nvme-auth: convert dhchap_auth_list to an array

We know exactly how many dhchap contexts we will need, there is no need
to hold a list that we need to protect with a mutex. Convert to
a dynamically allocated array. And dhchap_context access state is
maintained by the chap itself.

Make dhchap_auth_mutex protect only the ctrl host_key and ctrl_key
in a fine-graned lock such that there is no long lasting acuisition
of the lock.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 113 +++++++++++++++++++++------------------
 drivers/nvme/host/core.c |   4 ++
 drivers/nvme/host/nvme.h |   2 +-
 3 files changed, 65 insertions(+), 54 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 23d486284da9..b78e0df54f6c 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -46,6 +46,14 @@ struct nvme_dhchap_queue_context {
 	(qid == 0) ? 0 : BLK_MQ_REQ_NOWAIT | BLK_MQ_REQ_RESERVED
 #define nvme_auth_queue_from_qid(ctrl, qid) \
 	(qid == 0) ? (ctrl)->fabrics_q : (ctrl)->connect_q
+#define nvme_auth_chap_from_qid(ctrl, qid) \
+	&((struct nvme_dhchap_queue_context*)(ctrl)->dhchap_ctxs)[qid]
+#define ctrl_max_dhchaps(ctrl) \
+	(ctrl)->opts->nr_io_queues + (ctrl)->opts->nr_write_queues + \
+			(ctrl)->opts->nr_poll_queues + 1
+#define nvme_foreach_dhchap(i, chap, ctrl) \
+	for (i = 0, chap = (ctrl)->dhchap_ctxs; \
+	     i < ctrl_max_dhchaps(ctrl) && chap != NULL; i++, chap++)
 
 static int nvme_auth_submit(struct nvme_ctrl *ctrl, int qid,
 			    void *data, size_t data_len, bool auth_send)
@@ -330,7 +338,7 @@ static int nvme_auth_process_dhchap_success1(struct nvme_ctrl *ctrl,
 	struct nvmf_auth_dhchap_success1_data *data = chap->buf;
 	size_t size = sizeof(*data);
 
-	if (ctrl->ctrl_key)
+	if (chap->ctrl_key)
 		size += chap->hash_len;
 
 	if (chap->buf_size < size) {
@@ -418,8 +426,10 @@ static int nvme_auth_dhchap_setup_host_response(struct nvme_ctrl *ctrl,
 		__func__, chap->qid, chap->s1, chap->transaction);
 
 	if (!chap->host_response) {
+		mutex_lock(&ctrl->dhchap_auth_mutex);
 		chap->host_response = nvme_auth_transform_key(ctrl->host_key,
 						ctrl->opts->host->nqn);
+		mutex_unlock(&ctrl->dhchap_auth_mutex);
 		if (IS_ERR(chap->host_response)) {
 			ret = PTR_ERR(chap->host_response);
 			chap->host_response = NULL;
@@ -430,8 +440,10 @@ static int nvme_auth_dhchap_setup_host_response(struct nvme_ctrl *ctrl,
 			__func__, chap->qid);
 	}
 
+	mutex_lock(&ctrl->dhchap_auth_mutex);
 	ret = crypto_shash_setkey(chap->shash_tfm,
 			chap->host_response, ctrl->host_key->len);
+	mutex_unlock(&ctrl->dhchap_auth_mutex);
 	if (ret) {
 		dev_warn(ctrl->device, "qid %d: failed to set key, error %d\n",
 			 chap->qid, ret);
@@ -507,6 +519,7 @@ static int nvme_auth_dhchap_setup_ctrl_response(struct nvme_ctrl *ctrl,
 		ret = PTR_ERR(ctrl_response);
 		return ret;
 	}
+
 	ret = crypto_shash_setkey(chap->shash_tfm,
 			ctrl_response, ctrl->ctrl_key->len);
 	if (ret) {
@@ -665,7 +678,6 @@ static void nvme_auth_free_dhchap(struct nvme_dhchap_queue_context *chap)
 		crypto_free_shash(chap->shash_tfm);
 	if (chap->dh_tfm)
 		crypto_free_kpp(chap->dh_tfm);
-	kfree(chap);
 }
 
 static void __nvme_auth_work(struct work_struct *work)
@@ -789,6 +801,7 @@ static void __nvme_auth_work(struct work_struct *work)
 		return;
 	}
 
+	mutex_lock(&ctrl->dhchap_auth_mutex);
 	if (ctrl->ctrl_key) {
 		dev_dbg(ctrl->device,
 			"%s: qid %d controller response\n",
@@ -816,6 +829,7 @@ static void __nvme_auth_work(struct work_struct *work)
 		if (ret)
 			chap->error = ret;
 	}
+	mutex_unlock(&ctrl->dhchap_auth_mutex);
 	if (!ret) {
 		chap->error = 0;
 		return;
@@ -848,29 +862,8 @@ int nvme_auth_negotiate(struct nvme_ctrl *ctrl, int qid)
 		return -ENOKEY;
 	}
 
-	mutex_lock(&ctrl->dhchap_auth_mutex);
-	/* Check if the context is already queued */
-	list_for_each_entry(chap, &ctrl->dhchap_auth_list, entry) {
-		WARN_ON(!chap->buf);
-		if (chap->qid == qid) {
-			dev_dbg(ctrl->device, "qid %d: re-using context\n", qid);
-			mutex_unlock(&ctrl->dhchap_auth_mutex);
-			flush_work(&chap->auth_work);
-			nvme_auth_reset_dhchap(chap);
-			queue_work(nvme_wq, &chap->auth_work);
-			return 0;
-		}
-	}
-	chap = kzalloc(sizeof(*chap), GFP_KERNEL);
-	if (!chap) {
-		mutex_unlock(&ctrl->dhchap_auth_mutex);
-		return -ENOMEM;
-	}
-	chap->qid = qid;
-	chap->ctrl = ctrl;
-	INIT_WORK(&chap->auth_work, __nvme_auth_work);
-	list_add(&chap->entry, &ctrl->dhchap_auth_list);
-	mutex_unlock(&ctrl->dhchap_auth_mutex);
+	chap = nvme_auth_chap_from_qid(ctrl, qid);
+	cancel_work_sync(&chap->auth_work);
 	queue_work(nvme_wq, &chap->auth_work);
 	return 0;
 }
@@ -881,19 +874,12 @@ int nvme_auth_wait(struct nvme_ctrl *ctrl, int qid)
 	struct nvme_dhchap_queue_context *chap;
 	int ret;
 
-	mutex_lock(&ctrl->dhchap_auth_mutex);
-	list_for_each_entry(chap, &ctrl->dhchap_auth_list, entry) {
-		if (chap->qid != qid)
-			continue;
-		mutex_unlock(&ctrl->dhchap_auth_mutex);
-		flush_work(&chap->auth_work);
-		ret = chap->error;
-		/* clear sensitive info */
-		nvme_auth_reset_dhchap(chap);
-		return ret;
-	}
-	mutex_unlock(&ctrl->dhchap_auth_mutex);
-	return -ENXIO;
+	chap = nvme_auth_chap_from_qid(ctrl, qid);
+	flush_work(&chap->auth_work);
+	ret = chap->error;
+	/* clear sensitive info */
+	nvme_auth_reset_dhchap(chap);
+	return ret;
 }
 EXPORT_SYMBOL_GPL(nvme_auth_wait);
 
@@ -942,11 +928,11 @@ static void nvme_dhchap_auth_work(struct work_struct *work)
 
 int nvme_auth_init_ctrl(struct nvme_ctrl *ctrl)
 {
-	int ret;
+	struct nvme_dhchap_queue_context *chap;
+	int i, ret;
 
-	INIT_LIST_HEAD(&ctrl->dhchap_auth_list);
-	INIT_WORK(&ctrl->dhchap_auth_work, nvme_dhchap_auth_work);
 	mutex_init(&ctrl->dhchap_auth_mutex);
+	INIT_WORK(&ctrl->dhchap_auth_work, nvme_dhchap_auth_work);
 	if (!ctrl->opts)
 		return 0;
 	ret = nvme_auth_generate_key(ctrl->opts->dhchap_secret,
@@ -955,37 +941,57 @@ int nvme_auth_init_ctrl(struct nvme_ctrl *ctrl)
 		return ret;
 	ret = nvme_auth_generate_key(ctrl->opts->dhchap_ctrl_secret,
 			&ctrl->ctrl_key);
-	if (ret) {
-		nvme_auth_free_key(ctrl->host_key);
-		ctrl->host_key = NULL;
+	if (ret)
+		goto err_free_dhchap_secret;
+
+	if (!ctrl->opts->dhchap_secret && !ctrl->opts->dhchap_ctrl_secret)
+		return ret;
+
+	ctrl->dhchap_ctxs = kvcalloc(ctrl_max_dhchaps(ctrl),
+				sizeof(*chap), GFP_KERNEL);
+	if (!ctrl->dhchap_ctxs) {
+		ret = -ENOMEM;
+		goto err_free_dhchap_ctrl_secret;
 	}
+
+	nvme_foreach_dhchap(i, chap, ctrl) {
+		chap->qid = i;
+		chap->ctrl = ctrl;
+		INIT_WORK(&chap->auth_work, __nvme_auth_work);
+	}
+
+out:
 	return ret;
+err_free_dhchap_ctrl_secret:
+	nvme_auth_free_key(ctrl->ctrl_key);
+	ctrl->ctrl_key = NULL;
+err_free_dhchap_secret:
+	nvme_auth_free_key(ctrl->host_key);
+	ctrl->host_key = NULL;
+	goto out;
 }
 EXPORT_SYMBOL_GPL(nvme_auth_init_ctrl);
 
 void nvme_auth_stop(struct nvme_ctrl *ctrl)
 {
-	struct nvme_dhchap_queue_context *chap = NULL, *tmp;
+	struct nvme_dhchap_queue_context *chap;
+	int i;
 
 	cancel_work_sync(&ctrl->dhchap_auth_work);
-	mutex_lock(&ctrl->dhchap_auth_mutex);
-	list_for_each_entry_safe(chap, tmp, &ctrl->dhchap_auth_list, entry)
+	nvme_foreach_dhchap(i, chap, ctrl)
 		cancel_work_sync(&chap->auth_work);
-	mutex_unlock(&ctrl->dhchap_auth_mutex);
 }
 EXPORT_SYMBOL_GPL(nvme_auth_stop);
 
 void nvme_auth_free(struct nvme_ctrl *ctrl)
 {
-	struct nvme_dhchap_queue_context *chap = NULL, *tmp;
+	struct nvme_dhchap_queue_context *chap;
+	int i;
 
-	mutex_lock(&ctrl->dhchap_auth_mutex);
-	list_for_each_entry_safe(chap, tmp, &ctrl->dhchap_auth_list, entry) {
-		list_del_init(&chap->entry);
+	nvme_foreach_dhchap(i, chap, ctrl) {
 		flush_work(&chap->auth_work);
 		nvme_auth_free_dhchap(chap);
 	}
-	mutex_unlock(&ctrl->dhchap_auth_mutex);
 	if (ctrl->host_key) {
 		nvme_auth_free_key(ctrl->host_key);
 		ctrl->host_key = NULL;
@@ -994,5 +1000,6 @@ void nvme_auth_free(struct nvme_ctrl *ctrl)
 		nvme_auth_free_key(ctrl->ctrl_key);
 		ctrl->ctrl_key = NULL;
 	}
+	kfree(ctrl->dhchap_ctxs);
 }
 EXPORT_SYMBOL_GPL(nvme_auth_free);
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index c15dbbe509f5..d8e6ff4257e9 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3755,7 +3755,9 @@ static ssize_t nvme_ctrl_dhchap_secret_store(struct device *dev,
 		kfree(opts->dhchap_secret);
 		opts->dhchap_secret = dhchap_secret;
 		host_key = ctrl->host_key;
+		mutex_lock(&ctrl->dhchap_auth_mutex);
 		ctrl->host_key = key;
+		mutex_unlock(&ctrl->dhchap_auth_mutex);
 		nvme_auth_free_key(host_key);
 	}
 	/* Start re-authentication */
@@ -3807,7 +3809,9 @@ static ssize_t nvme_ctrl_dhchap_ctrl_secret_store(struct device *dev,
 		kfree(opts->dhchap_ctrl_secret);
 		opts->dhchap_ctrl_secret = dhchap_secret;
 		ctrl_key = ctrl->ctrl_key;
+		mutex_lock(&ctrl->dhchap_auth_mutex);
 		ctrl->ctrl_key = key;
+		mutex_unlock(&ctrl->dhchap_auth_mutex);
 		nvme_auth_free_key(ctrl_key);
 	}
 	/* Start re-authentication */
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index f9708825aa24..7d12759857d7 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -336,8 +336,8 @@ struct nvme_ctrl {
 
 #ifdef CONFIG_NVME_AUTH
 	struct work_struct dhchap_auth_work;
-	struct list_head dhchap_auth_list;
 	struct mutex dhchap_auth_mutex;
+	void *dhchap_ctxs;
 	struct nvme_dhchap_key *host_key;
 	struct nvme_dhchap_key *ctrl_key;
 	u16 transaction;
-- 
2.34.1


[-- Attachment #9: 0011-nvme-auth-no-need-to-reset-chap-contexts-on-re-authe.patch --]
[-- Type: text/x-patch, Size: 2752 bytes --]

From a4f1dc914acb0eb2edccc9fa5e01bf87ab02c258 Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Mon, 24 Oct 2022 14:12:47 +0300
Subject: [PATCH 11/16] nvme-auth: no need to reset chap contexts on
 re-authentication

Now that the chap context is reset upon completion, this is no longer
needed. Also remove nvme_auth_reset as no callers are left.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 13 -------------
 drivers/nvme/host/core.c |  4 ----
 drivers/nvme/host/nvme.h |  1 -
 3 files changed, 18 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 9218f50d8be3..23d486284da9 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -897,19 +897,6 @@ int nvme_auth_wait(struct nvme_ctrl *ctrl, int qid)
 }
 EXPORT_SYMBOL_GPL(nvme_auth_wait);
 
-void nvme_auth_reset(struct nvme_ctrl *ctrl)
-{
-	struct nvme_dhchap_queue_context *chap;
-
-	mutex_lock(&ctrl->dhchap_auth_mutex);
-	list_for_each_entry(chap, &ctrl->dhchap_auth_list, entry) {
-		mutex_unlock(&ctrl->dhchap_auth_mutex);
-		flush_work(&chap->auth_work);
-		nvme_auth_reset_dhchap(chap);
-	}
-	mutex_unlock(&ctrl->dhchap_auth_mutex);
-}
-
 static void nvme_dhchap_auth_work(struct work_struct *work)
 {
 	struct nvme_ctrl *ctrl =
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 4aa855a4839a..c15dbbe509f5 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3757,8 +3757,6 @@ static ssize_t nvme_ctrl_dhchap_secret_store(struct device *dev,
 		host_key = ctrl->host_key;
 		ctrl->host_key = key;
 		nvme_auth_free_key(host_key);
-		/* Key has changed; re-authentication with new key */
-		nvme_auth_reset(ctrl);
 	}
 	/* Start re-authentication */
 	dev_info(ctrl->device, "re-authenticating controller\n");
@@ -3811,8 +3809,6 @@ static ssize_t nvme_ctrl_dhchap_ctrl_secret_store(struct device *dev,
 		ctrl_key = ctrl->ctrl_key;
 		ctrl->ctrl_key = key;
 		nvme_auth_free_key(ctrl_key);
-		/* Key has changed; re-authentication with new key */
-		nvme_auth_reset(ctrl);
 	}
 	/* Start re-authentication */
 	dev_info(ctrl->device, "re-authenticating controller\n");
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index cc314f45ddee..f9708825aa24 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -1023,7 +1023,6 @@ int nvme_auth_init_ctrl(struct nvme_ctrl *ctrl);
 void nvme_auth_stop(struct nvme_ctrl *ctrl);
 int nvme_auth_negotiate(struct nvme_ctrl *ctrl, int qid);
 int nvme_auth_wait(struct nvme_ctrl *ctrl, int qid);
-void nvme_auth_reset(struct nvme_ctrl *ctrl);
 void nvme_auth_free(struct nvme_ctrl *ctrl);
 #else
 static inline void nvme_auth_init_ctrl(struct nvme_ctrl *ctrl) {};
-- 
2.34.1


[-- Attachment #10: 0010-nvme-auth-remove-redundant-deallocations.patch --]
[-- Type: text/x-patch, Size: 2216 bytes --]

From 7105557c021e1c2bc67e19f5ec6449ee03c7d0ca Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Thu, 27 Oct 2022 15:48:47 +0300
Subject: [PATCH 10/16] nvme-auth: remove redundant deallocations

These are now redundant as the dhchap context is
removed after authentication completes.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 20 --------------------
 1 file changed, 20 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 484315efa0b2..9218f50d8be3 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -197,12 +197,6 @@ static int nvme_auth_process_dhchap_challenge(struct nvme_ctrl *ctrl,
 		return NVME_SC_AUTH_REQUIRED;
 	}
 
-	/* Reset host response if the hash had been changed */
-	if (chap->hash_id != data->hashid) {
-		kfree(chap->host_response);
-		chap->host_response = NULL;
-	}
-
 	chap->hash_id = data->hashid;
 	chap->hash_len = data->hl;
 	dev_dbg(ctrl->device, "qid %d: selected hash %s\n",
@@ -219,14 +213,6 @@ static int nvme_auth_process_dhchap_challenge(struct nvme_ctrl *ctrl,
 		return NVME_SC_AUTH_REQUIRED;
 	}
 
-	/* Clear host and controller key to avoid accidental reuse */
-	kfree_sensitive(chap->host_key);
-	chap->host_key = NULL;
-	chap->host_key_len = 0;
-	kfree_sensitive(chap->ctrl_key);
-	chap->ctrl_key = NULL;
-	chap->ctrl_key_len = 0;
-
 	if (chap->dhgroup_id == data->dhgid &&
 	    (data->dhgid == NVME_AUTH_DHGROUP_NULL || chap->dh_tfm)) {
 		dev_dbg(ctrl->device,
@@ -621,9 +607,6 @@ static int nvme_auth_dhchap_exponential(struct nvme_ctrl *ctrl,
 	if (ret) {
 		dev_dbg(ctrl->device,
 			"failed to generate public key, error %d\n", ret);
-		kfree(chap->host_key);
-		chap->host_key = NULL;
-		chap->host_key_len = 0;
 		chap->status = NVME_AUTH_DHCHAP_FAILURE_INCORRECT_PAYLOAD;
 		return ret;
 	}
@@ -643,9 +626,6 @@ static int nvme_auth_dhchap_exponential(struct nvme_ctrl *ctrl,
 	if (ret) {
 		dev_dbg(ctrl->device,
 			"failed to generate shared secret, error %d\n", ret);
-		kfree_sensitive(chap->sess_key);
-		chap->sess_key = NULL;
-		chap->sess_key_len = 0;
 		chap->status = NVME_AUTH_DHCHAP_FAILURE_INCORRECT_PAYLOAD;
 		return ret;
 	}
-- 
2.34.1


[-- Attachment #11: 0009-nvme-auth-clear-sensitive-info-right-after-authentic.patch --]
[-- Type: text/x-patch, Size: 927 bytes --]

From 9f0a4d87f90bb547c965b661eae3b6851586c15e Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Mon, 24 Oct 2022 13:59:17 +0300
Subject: [PATCH 09/16] nvme-auth: clear sensitive info right after
 authentication completes

We don't want to keep authentication sensitive info in memory for unlimited
amount of time.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 201f25267685..484315efa0b2 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -908,6 +908,8 @@ int nvme_auth_wait(struct nvme_ctrl *ctrl, int qid)
 		mutex_unlock(&ctrl->dhchap_auth_mutex);
 		flush_work(&chap->auth_work);
 		ret = chap->error;
+		/* clear sensitive info */
+		nvme_auth_reset_dhchap(chap);
 		return ret;
 	}
 	mutex_unlock(&ctrl->dhchap_auth_mutex);
-- 
2.34.1


[-- Attachment #12: 0008-nvme-auth-don-t-keep-long-lived-4k-dhchap-buffer.patch --]
[-- Type: text/x-patch, Size: 2363 bytes --]

From 5025b31b726c495b0562a8125b10d34658af7397 Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Thu, 27 Oct 2022 15:06:48 +0300
Subject: [PATCH 08/16] nvme-auth: don't keep long lived 4k dhchap buffer

dhchap structure is per-queue, it is wasteful to keep it for the entire
lifetime of the queue. Allocate it dynamically and get rid of it after
authentication. We don't need kzalloc because all accessors are clearing
it before writing to it.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 07d497f49e4b..201f25267685 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -674,6 +674,8 @@ static void nvme_auth_reset_dhchap(struct nvme_dhchap_queue_context *chap)
 	chap->transaction = 0;
 	memset(chap->c1, 0, sizeof(chap->c1));
 	memset(chap->c2, 0, sizeof(chap->c2));
+	kfree(chap->buf);
+	chap->buf = NULL;
 }
 
 static void nvme_auth_free_dhchap(struct nvme_dhchap_queue_context *chap)
@@ -683,7 +685,6 @@ static void nvme_auth_free_dhchap(struct nvme_dhchap_queue_context *chap)
 		crypto_free_shash(chap->shash_tfm);
 	if (chap->dh_tfm)
 		crypto_free_kpp(chap->dh_tfm);
-	kfree(chap->buf);
 	kfree(chap);
 }
 
@@ -695,6 +696,17 @@ static void __nvme_auth_work(struct work_struct *work)
 	size_t tl;
 	int ret = 0;
 
+	/*
+	 * Allocate a large enough buffer for the entire negotiation:
+	 * 4k should be enough to ffdhe8192.
+	 */
+	chap->buf_size = 4096;
+	chap->buf = kmalloc(chap->buf_size, GFP_KERNEL);
+	if (!chap->buf) {
+		chap->error = -ENOMEM;
+		return;
+	}
+
 	chap->transaction = ctrl->transaction++;
 
 	/* DH-HMAC-CHAP Step 1: send negotiate */
@@ -876,19 +888,6 @@ int nvme_auth_negotiate(struct nvme_ctrl *ctrl, int qid)
 	}
 	chap->qid = qid;
 	chap->ctrl = ctrl;
-
-	/*
-	 * Allocate a large enough buffer for the entire negotiation:
-	 * 4k should be enough to ffdhe8192.
-	 */
-	chap->buf_size = 4096;
-	chap->buf = kzalloc(chap->buf_size, GFP_KERNEL);
-	if (!chap->buf) {
-		mutex_unlock(&ctrl->dhchap_auth_mutex);
-		kfree(chap);
-		return -ENOMEM;
-	}
-
 	INIT_WORK(&chap->auth_work, __nvme_auth_work);
 	list_add(&chap->entry, &ctrl->dhchap_auth_list);
 	mutex_unlock(&ctrl->dhchap_auth_mutex);
-- 
2.34.1


[-- Attachment #13: 0007-nvme-auth-remove-redundant-if-statement.patch --]
[-- Type: text/x-patch, Size: 814 bytes --]

From 4a419620b1fa56396839d46dbd7cbd0cbbb18cab Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Thu, 27 Oct 2022 15:02:52 +0300
Subject: [PATCH 07/16] nvme-auth: remove redundant if statement

No one passes NVME_QID_ANY to nvme_auth_negotiate.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index a6312a8846f8..07d497f49e4b 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -874,7 +874,7 @@ int nvme_auth_negotiate(struct nvme_ctrl *ctrl, int qid)
 		mutex_unlock(&ctrl->dhchap_auth_mutex);
 		return -ENOMEM;
 	}
-	chap->qid = (qid == NVME_QID_ANY) ? 0 : qid;
+	chap->qid = qid;
 	chap->ctrl = ctrl;
 
 	/*
-- 
2.34.1


[-- Attachment #14: 0006-nvme-auth-don-t-override-ctrl-keys-before-validation.patch --]
[-- Type: text/x-patch, Size: 2007 bytes --]

From b5762785d24f4dd4f68a7514b45bf1091ec96005 Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Mon, 24 Oct 2022 14:24:09 +0300
Subject: [PATCH 06/16] nvme-auth: don't override ctrl keys before validation

Replace ctrl ctrl_key/host_key only after nvme_auth_generate_key is successful.
Also, this fixes a bug where the keys are leaked.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/core.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 09f2b44fe771..4aa855a4839a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3746,13 +3746,17 @@ static ssize_t nvme_ctrl_dhchap_secret_store(struct device *dev,
 	memcpy(dhchap_secret, buf, count);
 	nvme_auth_stop(ctrl);
 	if (strcmp(dhchap_secret, opts->dhchap_secret)) {
+		struct nvme_dhchap_key *key, *host_key;
 		int ret;
 
-		ret = nvme_auth_generate_key(dhchap_secret, &ctrl->host_key);
+		ret = nvme_auth_generate_key(dhchap_secret, &key);
 		if (ret)
 			return ret;
 		kfree(opts->dhchap_secret);
 		opts->dhchap_secret = dhchap_secret;
+		host_key = ctrl->host_key;
+		ctrl->host_key = key;
+		nvme_auth_free_key(host_key);
 		/* Key has changed; re-authentication with new key */
 		nvme_auth_reset(ctrl);
 	}
@@ -3796,13 +3800,17 @@ static ssize_t nvme_ctrl_dhchap_ctrl_secret_store(struct device *dev,
 	memcpy(dhchap_secret, buf, count);
 	nvme_auth_stop(ctrl);
 	if (strcmp(dhchap_secret, opts->dhchap_ctrl_secret)) {
+		struct nvme_dhchap_key *key, *ctrl_key;
 		int ret;
 
-		ret = nvme_auth_generate_key(dhchap_secret, &ctrl->ctrl_key);
+		ret = nvme_auth_generate_key(dhchap_secret, &key);
 		if (ret)
 			return ret;
 		kfree(opts->dhchap_ctrl_secret);
 		opts->dhchap_ctrl_secret = dhchap_secret;
+		ctrl_key = ctrl->ctrl_key;
+		ctrl->ctrl_key = key;
+		nvme_auth_free_key(ctrl_key);
 		/* Key has changed; re-authentication with new key */
 		nvme_auth_reset(ctrl);
 	}
-- 
2.34.1


[-- Attachment #15: 0005-nvme-auth-don-t-ignore-key-generation-failures-when-.patch --]
[-- Type: text/x-patch, Size: 2754 bytes --]

From 94d9c0183a247f797a8804c520337eb875e97601 Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Mon, 24 Oct 2022 14:20:26 +0300
Subject: [PATCH 05/16] nvme-auth: don't ignore key generation failures when
 initializing ctrl keys

nvme_auth_generate_key can fail, don't ignore it upon initialization.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 19 +++++++++++++++----
 drivers/nvme/host/core.c |  6 +++++-
 drivers/nvme/host/nvme.h |  2 +-
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 15cddc2bb14d..a6312a8846f8 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -972,15 +972,26 @@ static void nvme_dhchap_auth_work(struct work_struct *work)
 	 */
 }
 
-void nvme_auth_init_ctrl(struct nvme_ctrl *ctrl)
+int nvme_auth_init_ctrl(struct nvme_ctrl *ctrl)
 {
+	int ret;
+
 	INIT_LIST_HEAD(&ctrl->dhchap_auth_list);
 	INIT_WORK(&ctrl->dhchap_auth_work, nvme_dhchap_auth_work);
 	mutex_init(&ctrl->dhchap_auth_mutex);
 	if (!ctrl->opts)
-		return;
-	nvme_auth_generate_key(ctrl->opts->dhchap_secret, &ctrl->host_key);
-	nvme_auth_generate_key(ctrl->opts->dhchap_ctrl_secret, &ctrl->ctrl_key);
+		return 0;
+	ret = nvme_auth_generate_key(ctrl->opts->dhchap_secret,
+			&ctrl->host_key);
+	if (ret)
+		return ret;
+	ret = nvme_auth_generate_key(ctrl->opts->dhchap_ctrl_secret,
+			&ctrl->ctrl_key);
+	if (ret) {
+		nvme_auth_free_key(ctrl->host_key);
+		ctrl->host_key = NULL;
+	}
+	return ret;
 }
 EXPORT_SYMBOL_GPL(nvme_auth_init_ctrl);
 
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index dc4220600585..09f2b44fe771 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -5078,9 +5078,13 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
 
 	nvme_fault_inject_init(&ctrl->fault_inject, dev_name(ctrl->device));
 	nvme_mpath_init_ctrl(ctrl);
-	nvme_auth_init_ctrl(ctrl);
+	ret = nvme_auth_init_ctrl(ctrl);
+	if (ret)
+		goto out_free_cdev;
 
 	return 0;
+out_free_cdev:
+	cdev_device_del(&ctrl->cdev, ctrl->device);
 out_free_name:
 	nvme_put_ctrl(ctrl);
 	kfree_const(ctrl->device->kobj.name);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index a29877217ee6..cc314f45ddee 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -1019,7 +1019,7 @@ static inline bool nvme_ctrl_sgl_supported(struct nvme_ctrl *ctrl)
 }
 
 #ifdef CONFIG_NVME_AUTH
-void nvme_auth_init_ctrl(struct nvme_ctrl *ctrl);
+int nvme_auth_init_ctrl(struct nvme_ctrl *ctrl);
 void nvme_auth_stop(struct nvme_ctrl *ctrl);
 int nvme_auth_negotiate(struct nvme_ctrl *ctrl, int qid);
 int nvme_auth_wait(struct nvme_ctrl *ctrl, int qid);
-- 
2.34.1


[-- Attachment #16: 0004-nvme-auth-remove-redundant-buffer-deallocations.patch --]
[-- Type: text/x-patch, Size: 1019 bytes --]

From d0fab6588b03bf00e6ef90170952f9f56cc87a8e Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Mon, 24 Oct 2022 13:49:07 +0300
Subject: [PATCH 04/16] nvme-auth: remove redundant buffer deallocations

host_response, host_key, ctrl_key and sess_key are
freed in nvme_auth_reset_dhchap which is called from
nvme_auth_free_dhchap.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 93c0fc71bc7c..15cddc2bb14d 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -683,10 +683,6 @@ static void nvme_auth_free_dhchap(struct nvme_dhchap_queue_context *chap)
 		crypto_free_shash(chap->shash_tfm);
 	if (chap->dh_tfm)
 		crypto_free_kpp(chap->dh_tfm);
-	kfree_sensitive(chap->ctrl_key);
-	kfree_sensitive(chap->host_key);
-	kfree_sensitive(chap->sess_key);
-	kfree_sensitive(chap->host_response);
 	kfree(chap->buf);
 	kfree(chap);
 }
-- 
2.34.1


[-- Attachment #17: 0003-nvme-auth-don-t-re-authenticate-if-the-controller-is.patch --]
[-- Type: text/x-patch, Size: 975 bytes --]

From e54b5b3886b3343769487f7cdaffd1f3b42d2edb Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Mon, 24 Oct 2022 13:46:02 +0300
Subject: [PATCH 03/16] nvme-auth: don't re-authenticate if the controller is
 not LIVE

The connect sequence will re-authenticate.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 734928282d3e..93c0fc71bc7c 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -939,6 +939,13 @@ static void nvme_dhchap_auth_work(struct work_struct *work)
 		container_of(work, struct nvme_ctrl, dhchap_auth_work);
 	int ret, q;
 
+	/*
+	 * If the ctrl is no connected, bail as reconnect will handle
+	 * authentication.
+	 */
+	if (ctrl->state != NVME_CTRL_LIVE)
+		return;
+
 	/* Authenticate admin queue first */
 	ret = nvme_auth_negotiate(ctrl, 0);
 	if (ret) {
-- 
2.34.1


[-- Attachment #18: 0002-nvme-auth-remove-symbol-export-from-nvme_auth_reset.patch --]
[-- Type: text/x-patch, Size: 769 bytes --]

From 0e36b06563a00073e27dbb0d54569e07ad402343 Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Mon, 24 Oct 2022 12:20:13 +0300
Subject: [PATCH 02/16] nvme-auth: remove symbol export from nvme_auth_reset

Only the nvme module calls it.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index d45333268fcf..734928282d3e 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -932,7 +932,6 @@ void nvme_auth_reset(struct nvme_ctrl *ctrl)
 	}
 	mutex_unlock(&ctrl->dhchap_auth_mutex);
 }
-EXPORT_SYMBOL_GPL(nvme_auth_reset);
 
 static void nvme_dhchap_auth_work(struct work_struct *work)
 {
-- 
2.34.1


[-- Attachment #19: 0001-nvme-auth-rename-__nvme_auth_-reset-free-to-nvme_aut.patch --]
[-- Type: text/x-patch, Size: 2468 bytes --]

From 4e1fddf9dd76be2ed1b4c00664f503c6e36ac70b Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi@grimberg.me>
Date: Sun, 23 Oct 2022 11:12:25 +0300
Subject: [PATCH 01/16] nvme-auth: rename __nvme_auth_[reset|free] to
 nvme_auth[reset|free]_dhchap

nvme_auth_[reset|free] operate on the controller while
__nvme_auth_[reset|free] operate on a chap struct (which maps to a queue
context). Rename it for clarity.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/auth.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index c8a6db7c4498..d45333268fcf 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -654,7 +654,7 @@ static int nvme_auth_dhchap_exponential(struct nvme_ctrl *ctrl,
 	return 0;
 }
 
-static void __nvme_auth_reset(struct nvme_dhchap_queue_context *chap)
+static void nvme_auth_reset_dhchap(struct nvme_dhchap_queue_context *chap)
 {
 	kfree_sensitive(chap->host_response);
 	chap->host_response = NULL;
@@ -676,9 +676,9 @@ static void __nvme_auth_reset(struct nvme_dhchap_queue_context *chap)
 	memset(chap->c2, 0, sizeof(chap->c2));
 }
 
-static void __nvme_auth_free(struct nvme_dhchap_queue_context *chap)
+static void nvme_auth_free_dhchap(struct nvme_dhchap_queue_context *chap)
 {
-	__nvme_auth_reset(chap);
+	nvme_auth_reset_dhchap(chap);
 	if (chap->shash_tfm)
 		crypto_free_shash(chap->shash_tfm);
 	if (chap->dh_tfm)
@@ -868,7 +868,7 @@ int nvme_auth_negotiate(struct nvme_ctrl *ctrl, int qid)
 			dev_dbg(ctrl->device, "qid %d: re-using context\n", qid);
 			mutex_unlock(&ctrl->dhchap_auth_mutex);
 			flush_work(&chap->auth_work);
-			__nvme_auth_reset(chap);
+			nvme_auth_reset_dhchap(chap);
 			queue_work(nvme_wq, &chap->auth_work);
 			return 0;
 		}
@@ -928,7 +928,7 @@ void nvme_auth_reset(struct nvme_ctrl *ctrl)
 	list_for_each_entry(chap, &ctrl->dhchap_auth_list, entry) {
 		mutex_unlock(&ctrl->dhchap_auth_mutex);
 		flush_work(&chap->auth_work);
-		__nvme_auth_reset(chap);
+		nvme_auth_reset_dhchap(chap);
 	}
 	mutex_unlock(&ctrl->dhchap_auth_mutex);
 }
@@ -1002,7 +1002,7 @@ void nvme_auth_free(struct nvme_ctrl *ctrl)
 	list_for_each_entry_safe(chap, tmp, &ctrl->dhchap_auth_list, entry) {
 		list_del_init(&chap->entry);
 		flush_work(&chap->auth_work);
-		__nvme_auth_free(chap);
+		nvme_auth_free_dhchap(chap);
 	}
 	mutex_unlock(&ctrl->dhchap_auth_mutex);
 	if (ctrl->host_key) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 08/16] nvme-auth: don't keep long lived 4k dhchap buffer
  2022-11-09  6:39   ` Christoph Hellwig
@ 2022-11-09 18:05     ` Sagi Grimberg
  0 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09 18:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-nvme, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni


>> dhchap structure is per-queue, it is wasteful to keep it for the entire
>> lifetime of the queue. Allocate it dynamically and get rid of it after
>> authentication. We don't need kzalloc because all accessors are clearing
>> it before writing to it.
> 
> Btw, do we need a mempool here to make sure we always get one for
> a reconnect under memory pressure?

We probably need it. Will add as an incremental patch.

>> +	/*
>> +	 * Allocate a large enough buffer for the entire negotiation:
>> +	 * 4k should be enough to ffdhe8192.
>> +	 */
>> +	chap->buf_size = 4096;
> 
> Same comments as for Hannes previous series:  no need for ->buf_size,
> this can be a global constant.
> 
> And in the comment the should is weird, it is enough.

Folding it in.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 07/16] nvme-auth: remove redundant if statement
  2022-11-09  6:37   ` Christoph Hellwig
@ 2022-11-09 18:42     ` Sagi Grimberg
  0 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09 18:42 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-nvme, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni


>> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
>> index a6312a8846f8..07d497f49e4b 100644
>> --- a/drivers/nvme/host/auth.c
>> +++ b/drivers/nvme/host/auth.c
>> @@ -874,7 +874,7 @@ int nvme_auth_negotiate(struct nvme_ctrl *ctrl, int qid)
>>   		mutex_unlock(&ctrl->dhchap_auth_mutex);
>>   		return -ENOMEM;
>>   	}
>> -	chap->qid = (qid == NVME_QID_ANY) ? 0 : qid;
>> +	chap->qid = qid;
> 
> Btw, the same "transformation" can also be removed in nvme_auth_submit.

There if qid = 0 we want to pass NVME_QID_ANY because we don't want
__nvme_submit_sync_cmd to use blk_mq_alloc_request_hctx as we are
using the admin tagset.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 03/16] nvme-auth: don't re-authenticate if the controller is not LIVE
  2022-11-09  7:33   ` Hannes Reinecke
@ 2022-11-09 19:00     ` Sagi Grimberg
  0 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09 19:00 UTC (permalink / raw)
  To: Hannes Reinecke, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni


>> The connect sequence will re-authenticate.
>>
>> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
>> ---
>>   drivers/nvme/host/auth.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
>> index 734928282d3e..93c0fc71bc7c 100644
>> --- a/drivers/nvme/host/auth.c
>> +++ b/drivers/nvme/host/auth.c
>> @@ -939,6 +939,13 @@ static void nvme_dhchap_auth_work(struct 
>> work_struct *work)
>>           container_of(work, struct nvme_ctrl, dhchap_auth_work);
>>       int ret, q;
>> +    /*
>> +     * If the ctrl is no connected, bail as reconnect will handle
>> +     * authentication.
>> +     */
>> +    if (ctrl->state != NVME_CTRL_LIVE)
>> +        return;
>> +
>>       /* Authenticate admin queue first */
>>       ret = nvme_auth_negotiate(ctrl, 0);
>>       if (ret) {
> 
> What about a state check in __nvme_auth_work()? That should only be 
> called if the queue is not in CONNECTING or LIVE.

I don't know why that would be needed. Now that the ctrl level work
checks for LIVE and waits for queues works, and nvme_auth_stop flushes
it, resets should never compete with this work.

> 
> Otherwise:
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> 
> Cheers,
> 
> Hannes


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 14/16] nvme-auth: have dhchap_auth_work wait for queues auth to complete
  2022-11-09  7:44   ` Hannes Reinecke
@ 2022-11-09 19:01     ` Sagi Grimberg
  0 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09 19:01 UTC (permalink / raw)
  To: Hannes Reinecke, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni



On 11/9/22 09:44, Hannes Reinecke wrote:
> On 11/9/22 04:44, Sagi Grimberg wrote:
>> It triggered the queue authentication work elements in parallel, but
>> the ctrl authentication work itself completes when all of them
>> completes. Hence wait for queues auth completions.
>>
>> This also makes nvme_auth_stop simply a sync cancel of ctrl
>> dhchap_auth_work.
>>
>> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
>> ---
>>   drivers/nvme/host/auth.c | 11 ++++++-----
>>   1 file changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
>> index 0159179c2455..e88d6254f2f5 100644
>> --- a/drivers/nvme/host/auth.c
>> +++ b/drivers/nvme/host/auth.c
>> @@ -924,6 +924,12 @@ static void nvme_dhchap_auth_work(struct 
>> work_struct *work)
>>        * Failure is a soft-state; credentials remain valid until
>>        * the controller terminates the connection.
>>        */
>> +    for (q = 1; q < ctrl->queue_count; q++) {
>> +        ret = nvme_auth_wait(ctrl, q);
>> +        if (ret)
>> +            dev_warn(ctrl->device,
>> +                 "qid %d: authentication failed\n", q);
>> +    }
>>   }
> Don't we need to clear the dhchap context here?

nvme_auth_wait clears it.

> 
>>   int nvme_auth_init_ctrl(struct nvme_ctrl *ctrl)
>> @@ -974,12 +980,7 @@ EXPORT_SYMBOL_GPL(nvme_auth_init_ctrl);
>>   void nvme_auth_stop(struct nvme_ctrl *ctrl)
>>   {
>> -    struct nvme_dhchap_queue_context *chap;
>> -    int i;
>> -
>>       cancel_work_sync(&ctrl->dhchap_auth_work);
>> -    nvme_foreach_dhchap(i, chap, ctrl)
>> -        cancel_work_sync(&chap->auth_work);
>>   }
>>   EXPORT_SYMBOL_GPL(nvme_auth_stop);
> Cheers,
> 
> Hannes


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 12/16] nvme-auth: convert dhchap_auth_list to an array
  2022-11-09 17:40   ` Hannes Reinecke
@ 2022-11-09 19:05     ` Sagi Grimberg
  0 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09 19:05 UTC (permalink / raw)
  To: Hannes Reinecke, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni



On 11/9/22 19:40, Hannes Reinecke wrote:
> On 11/9/22 04:44, Sagi Grimberg wrote:
>> We know exactly how many dhchap contexts we will need, there is no need
>> to hold a list that we need to protect with a mutex. Convert to
>> a dynamically allocated array. And dhchap_context access state is
>> maintained by the chap itself.
>>
>> Make dhchap_auth_mutex protect only the ctrl host_key and ctrl_key
>> in a fine-graned lock such that there is no long lasting acuisition
>> of the lock.
>>
>> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
>> ---
>>   drivers/nvme/host/auth.c | 113 +++++++++++++++++++++------------------
>>   drivers/nvme/host/core.c |   4 ++
>>   drivers/nvme/host/nvme.h |   2 +-
>>   3 files changed, 65 insertions(+), 54 deletions(-)
>>
>> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
>> index 23d486284da9..b78e0df54f6c 100644
>> --- a/drivers/nvme/host/auth.c
>> +++ b/drivers/nvme/host/auth.c
>> @@ -46,6 +46,14 @@ struct nvme_dhchap_queue_context {
>>       (qid == 0) ? 0 : BLK_MQ_REQ_NOWAIT | BLK_MQ_REQ_RESERVED
>>   #define nvme_auth_queue_from_qid(ctrl, qid) \
>>       (qid == 0) ? (ctrl)->fabrics_q : (ctrl)->connect_q
>> +#define nvme_auth_chap_from_qid(ctrl, qid) \
>> +    &((struct nvme_dhchap_queue_context*)(ctrl)->dhchap_ctxs)[qid]
>> +#define ctrl_max_dhchaps(ctrl) \
>> +    (ctrl)->opts->nr_io_queues + (ctrl)->opts->nr_write_queues + \
>> +            (ctrl)->opts->nr_poll_queues + 1
>> +#define nvme_foreach_dhchap(i, chap, ctrl) \
>> +    for (i = 0, chap = (ctrl)->dhchap_ctxs; \
>> +         i < ctrl_max_dhchaps(ctrl) && chap != NULL; i++, chap++)
>>   static int nvme_auth_submit(struct nvme_ctrl *ctrl, int qid,
>>                   void *data, size_t data_len, bool auth_send)
> 
> Please use an xarray (cf my patch doing the same).
> That leads to far better readable code, and avoid having to do weird casts.
> 
> Also, the claim 'we know exactly how many queues' isn't quite true, as 
> we only know the _maximum_ number of queues. The controller can (and 
> will) decrease this number based on the target configuration.

I don't want a dynamically allocated array, this pattern of first time
allocates a context and the rest of the time reuses is less readable and
causes interlocking.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 12/16] nvme-auth: convert dhchap_auth_list to an array
  2022-11-09  6:48   ` Christoph Hellwig
@ 2022-11-09 19:06     ` Sagi Grimberg
  0 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2022-11-09 19:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-nvme, Hannes Reinecke, Keith Busch, Chaitanya Kulkarni


>> Make dhchap_auth_mutex protect only the ctrl host_key and ctrl_key
>> in a fine-graned lock such that there is no long lasting acuisition
>> of the lock.
> 
> Can you split that part into a separate patch?  Right now the patch
> is a bit confusing because it does multiple things.

I couldn't find a way to split it in a way that won't have holes in
it. I'll try to look into this again.

>> +#define nvme_auth_chap_from_qid(ctrl, qid) \
>> +	&((struct nvme_dhchap_queue_context*)(ctrl)->dhchap_ctxs)[qid]
> 
> Please properly type dhchap_ctxs instead of the ugly cast here.  That
> also removed the need for the macro.
> functions instead of macros.

If I type it, nvme_dhchap_queue_context definition needs to move to
nvme.h, is that fine with everyone?

>> +#define ctrl_max_dhchaps(ctrl) \
>> +	(ctrl)->opts->nr_io_queues + (ctrl)->opts->nr_write_queues + \
>> +			(ctrl)->opts->nr_poll_queues + 1
> 
> This overallocates in case the controller supports less queues than
> requested, but I guess that's not a big problem in practice.

It isn't, we do the same with queues.

> Also please make this an inline function to add type safety and to make
> it more readable.

OK.

> 
>> +#define nvme_foreach_dhchap(i, chap, ctrl) \
>> +	for (i = 0, chap = (ctrl)->dhchap_ctxs; \
>> +	     i < ctrl_max_dhchaps(ctrl) && chap != NULL; i++, chap++)
> 
> I find this macro a bit obsfucating.  There's only three callers anyway,
> and they only use chap 1, 2 or three times respectively.
> 
> So I'd just loop:
> 
> 	for (i = 0, i < ctrl_max_dhchaps(ctrl); i++)
> 
> and then do the dereference of >dhchap_ctxs in the loop.

I take it that you don't like the foreach iterator...

>>   			    void *data, size_t data_len, bool auth_send)
>> @@ -330,7 +338,7 @@ static int nvme_auth_process_dhchap_success1(struct nvme_ctrl *ctrl,
>>   	struct nvmf_auth_dhchap_success1_data *data = chap->buf;
>>   	size_t size = sizeof(*data);
>>   
>> -	if (ctrl->ctrl_key)
>> +	if (chap->ctrl_key)
> 
> How is this related to the rest of the patch?

Here the code is dereferencing ctrl->ctrl_key which needs to synchronize
with sysfs driven modifications, but it has a queue chap context as 
well, which does not need a lock. Hence I changed the condition.

>> +		INIT_WORK(&chap->auth_work, __nvme_auth_work);
> 
> Btw, this really should lose the __ prefix in another prep patch.

Yes.


^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2022-11-09 19:06 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-09  3:44 [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
2022-11-09  3:44 ` [PATCH 01/16] nvme-auth: rename __nvme_auth_[reset|free] to nvme_auth[reset|free]_dhchap Sagi Grimberg
2022-11-09 17:35   ` Hannes Reinecke
2022-11-09  3:44 ` [PATCH 02/16] nvme-auth: remove symbol export from nvme_auth_reset Sagi Grimberg
2022-11-09  7:30   ` Hannes Reinecke
2022-11-09  3:44 ` [PATCH 03/16] nvme-auth: don't re-authenticate if the controller is not LIVE Sagi Grimberg
2022-11-09  7:33   ` Hannes Reinecke
2022-11-09 19:00     ` Sagi Grimberg
2022-11-09  3:44 ` [PATCH 04/16] nvme-auth: remove redundant buffer deallocations Sagi Grimberg
2022-11-09 17:36   ` Hannes Reinecke
2022-11-09  3:44 ` [PATCH 05/16] nvme-auth: don't ignore key generation failures when initializing ctrl keys Sagi Grimberg
2022-11-09  7:33   ` Hannes Reinecke
2022-11-09  3:44 ` [PATCH 06/16] nvme-auth: don't override ctrl keys before validation Sagi Grimberg
2022-11-09  7:34   ` Hannes Reinecke
2022-11-09  3:44 ` [PATCH 07/16] nvme-auth: remove redundant if statement Sagi Grimberg
2022-11-09  6:37   ` Christoph Hellwig
2022-11-09 18:42     ` Sagi Grimberg
2022-11-09  7:34   ` Hannes Reinecke
2022-11-09  3:44 ` [PATCH 08/16] nvme-auth: don't keep long lived 4k dhchap buffer Sagi Grimberg
2022-11-09  6:39   ` Christoph Hellwig
2022-11-09 18:05     ` Sagi Grimberg
2022-11-09 17:37   ` Hannes Reinecke
2022-11-09  3:44 ` [PATCH 09/16] nvme-auth: clear sensitive info right after authentication completes Sagi Grimberg
2022-11-09  7:35   ` Hannes Reinecke
2022-11-09  3:44 ` [PATCH 10/16] nvme-auth: remove redundant deallocations Sagi Grimberg
2022-11-09  3:44 ` [PATCH 11/16] nvme-auth: no need to reset chap contexts on re-authentication Sagi Grimberg
2022-11-09  7:36   ` Hannes Reinecke
2022-11-09  3:44 ` [PATCH 12/16] nvme-auth: convert dhchap_auth_list to an array Sagi Grimberg
2022-11-09  6:48   ` Christoph Hellwig
2022-11-09 19:06     ` Sagi Grimberg
2022-11-09 17:40   ` Hannes Reinecke
2022-11-09 19:05     ` Sagi Grimberg
2022-11-09  3:44 ` [PATCH 13/16] nvme-auth: remove redundant auth_work flush Sagi Grimberg
2022-11-09 17:41   ` Hannes Reinecke
2022-11-09  3:44 ` [PATCH 14/16] nvme-auth: have dhchap_auth_work wait for queues auth to complete Sagi Grimberg
2022-11-09  7:44   ` Hannes Reinecke
2022-11-09 19:01     ` Sagi Grimberg
2022-11-09  3:44 ` [PATCH 15/16] nvme-tcp: stop auth work after tearing down queues in error recovery Sagi Grimberg
2022-11-09  3:44 ` [PATCH 16/16] nvme-rdma: " Sagi Grimberg
2022-11-09  3:44 ` [PATCH 17/16 blktests] nvme: add re-authentication running concurrently with reset Sagi Grimberg
2022-11-09  3:47 ` [PATCH 00/16] nvme: rework dhchap authentication host code Sagi Grimberg
2022-11-09  7:45 ` Hannes Reinecke
2022-11-09 17:42   ` Sagi Grimberg

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.