All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHv5 00/14] nvme: In-kernel TLS support for TCP
@ 2023-08-03 10:50 Hannes Reinecke
  2023-08-03 10:50 ` [PATCH 01/14] nvme-keyring: register '.nvme' keyring Hannes Reinecke
                   ` (13 more replies)
  0 siblings, 14 replies; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-03 10:50 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

Hi all,

with the merge of Chuck Levers handshake upcall mechanism and
my tls_read_sock() implementation finally merged with net-next
it's now time to restart on the actual issue, implementing
in-kernel TLS support for nvme-tcp.

The patchset is based on the recent net-next git tree;
everything after commit 05191d8896b4 ("Merge branch
'in-kernel-support-for-the-tls-alert-protocol'") should work.
Just a tiny wee snag has been discovered in the ->read_sock()
implementation, so please ensure to have the patch
'net/tls: avoid TCP window full during ->read_sock()'
merged, too.

It also requires the 'tlshd' userspace daemon
(https://github.com/oracle/ktls-utils)
for the actual TLS handshake.
Changes for nvme-cli are already included in the upstream repository.

Theory of operation:
A dedicated '.nvme' keyring is created to hold the pre-shared keys (PSKs)
for the TLS handshake. Keys will have to be provisioned before TLS handshake
is attempted; that can be done with the 'nvme gen-tls-key' command for nvme-cli
(patches are already merged upstream).
After connection to the remote TCP port the client side will use the
'best' PSK (as inferred from the NVMe TCP spec) or the PSK specified
by the '--tls_key' option to nvme-cli and call the TLS userspace daemon
to initiate a TLS handshake.
The server side will then invoke the TLS userspace daemon to run the TLS
handshake.
If the TLS handshake succeeds the userspace daemon will be activating
kTLS on the socket, and control is passed back to the kernel.

Patchset can be found at:
git.kernel.org/pub/scm/linux/kernel/git/hare/nvme.git
branch tls.v8

For testing I'm using this script, running on a nvme target
with NQN 'nqn.test' and using 127.0.0.1 as a port:

modprobe nvmet-tcp
nvmetcli restore
modprobe nvme-tcp
./nvme gen-tls-key --subsysnqn=nqn.test -i
./nvme gen-tls-key --subsysnqn=nqn.2014-08.org.nvmexpress.discovery -i
tlshd -c /etc/tlshd.conf

and then one can do a simple:
# nvme connect -t tcp -a 127.0.0.1 -s 4420 -n nqn.test --tls
to start the connection.

As usual, comments and reviews are welcome.

Changes to v4:
- Split off network patches into a separate
  patchset
- Handle TLS Alert notifications

Changes to v3:
- Really handle MSG_EOR for TLS
- Fixup MSG_SENDPAGE_NOTLAST handling
- Conditionally disable fabric option

Changes to v2:
- Included reviews from Sagi
- Removed MSG_SENDPAGE_NOTLAST
- Improved MSG_EOR handling for TLS
- Add config options NVME_TCP_TLS
  and NVME_TARGET_TCP_TLS

Changes to the original RFC:
- Add a CONFIG_NVME_TLS config option
- Use a single PSK for the TLS handshake
- Make TLS connections mandatory
- Do not peek messages for the server
- Simplify data_ready callback
- Implement read_sock() for TLS

Hannes Reinecke (14):
  nvme-keyring: register '.nvme' keyring
  nvme-keyring: define a 'psk' keytype
  nvme: add TCP TSAS definitions
  nvme-tcp: add definitions for TLS cipher suites
  nvme-keyring: implement nvme_tls_psk_default()
  security/keys: export key_lookup()
  nvme/tcp: allocate socket file
  nvme-tcp: enable TLS handshake upcall
  nvme-tcp: control message handling for recvmsg()
  nvme-fabrics: parse options 'keyring' and 'tls_key'
  nvmet: make TCP sectype settable via configfs
  nvmet-tcp: allocate socket file
  nvmet-tcp: enable TLS handshake upcall
  nvmet-tcp: control messages for recvmsg()

 drivers/nvme/common/Kconfig    |   4 +
 drivers/nvme/common/Makefile   |   3 +-
 drivers/nvme/common/keyring.c  | 182 ++++++++++++++++++++++++
 drivers/nvme/host/Kconfig      |  14 ++
 drivers/nvme/host/core.c       |  12 +-
 drivers/nvme/host/fabrics.c    |  81 ++++++++++-
 drivers/nvme/host/fabrics.h    |   9 ++
 drivers/nvme/host/nvme.h       |   1 +
 drivers/nvme/host/sysfs.c      |  20 +++
 drivers/nvme/host/tcp.c        | 167 ++++++++++++++++++++--
 drivers/nvme/target/Kconfig    |  14 ++
 drivers/nvme/target/configfs.c | 128 ++++++++++++++++-
 drivers/nvme/target/nvmet.h    |   1 +
 drivers/nvme/target/tcp.c      | 249 +++++++++++++++++++++++++++++----
 include/linux/nvme-keyring.h   |  36 +++++
 include/linux/nvme-tcp.h       |   6 +
 include/linux/nvme.h           |  10 ++
 security/keys/key.c            |   1 +
 18 files changed, 891 insertions(+), 47 deletions(-)
 create mode 100644 drivers/nvme/common/keyring.c
 create mode 100644 include/linux/nvme-keyring.h

-- 
2.35.3



^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 01/14] nvme-keyring: register '.nvme' keyring
  2023-08-03 10:50 [PATCHv5 00/14] nvme: In-kernel TLS support for TCP Hannes Reinecke
@ 2023-08-03 10:50 ` Hannes Reinecke
  2023-08-07  7:09   ` Sagi Grimberg
  2023-08-03 10:50 ` [PATCH 02/14] nvme-keyring: define a 'psk' keytype Hannes Reinecke
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-03 10:50 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

Register a '.nvme' keyring to hold keys for TLS and DH-HMAC-CHAP and
add a new config option NVME_KEYRING.
We need a separate keyring for NVMe as the configuration is done
via individual commands (eg for configfs), and the usual per-session
or per-process keyrings can't be used.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/common/Kconfig   |  4 ++++
 drivers/nvme/common/Makefile  |  3 ++-
 drivers/nvme/common/keyring.c | 40 +++++++++++++++++++++++++++++++++++
 drivers/nvme/host/core.c      | 10 +++++++--
 include/linux/nvme-keyring.h  | 28 ++++++++++++++++++++++++
 5 files changed, 82 insertions(+), 3 deletions(-)
 create mode 100644 drivers/nvme/common/keyring.c
 create mode 100644 include/linux/nvme-keyring.h

diff --git a/drivers/nvme/common/Kconfig b/drivers/nvme/common/Kconfig
index 4514f44362dd..641b27adb047 100644
--- a/drivers/nvme/common/Kconfig
+++ b/drivers/nvme/common/Kconfig
@@ -2,3 +2,7 @@
 
 config NVME_COMMON
        tristate
+
+config NVME_KEYRING
+       bool
+       select KEYS
diff --git a/drivers/nvme/common/Makefile b/drivers/nvme/common/Makefile
index 720c625b8a52..0cbd0b0b8d49 100644
--- a/drivers/nvme/common/Makefile
+++ b/drivers/nvme/common/Makefile
@@ -4,4 +4,5 @@ ccflags-y			+= -I$(src)
 
 obj-$(CONFIG_NVME_COMMON)	+= nvme-common.o
 
-nvme-common-y			+= auth.o
+nvme-common-$(CONFIG_NVME_AUTH)	+= auth.o
+nvme-common-$(CONFIG_NVME_KEYRING) += keyring.o
diff --git a/drivers/nvme/common/keyring.c b/drivers/nvme/common/keyring.c
new file mode 100644
index 000000000000..5cf64b278119
--- /dev/null
+++ b/drivers/nvme/common/keyring.c
@@ -0,0 +1,40 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2023 Hannes Reinecke, SUSE Labs
+ */
+
+#include <linux/module.h>
+#include <linux/seq_file.h>
+#include <linux/key-type.h>
+#include <keys/user-type.h>
+#include <linux/nvme.h>
+
+static struct key *nvme_keyring;
+
+key_serial_t nvme_keyring_id(void)
+{
+	return nvme_keyring->serial;
+}
+EXPORT_SYMBOL_GPL(nvme_keyring_id);
+
+int nvme_keyring_init(void)
+{
+	nvme_keyring = keyring_alloc(".nvme",
+				     GLOBAL_ROOT_UID, GLOBAL_ROOT_GID,
+				     current_cred(),
+				     (KEY_POS_ALL & ~KEY_POS_SETATTR) |
+				     (KEY_USR_ALL & ~KEY_USR_SETATTR),
+				     KEY_ALLOC_NOT_IN_QUOTA, NULL, NULL);
+	if (IS_ERR(nvme_keyring))
+		return PTR_ERR(nvme_keyring);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nvme_keyring_init);
+
+void nvme_keyring_exit(void)
+{
+	key_revoke(nvme_keyring);
+	key_put(nvme_keyring);
+}
+EXPORT_SYMBOL_GPL(nvme_keyring_exit);
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 37b6fa746662..dfc574d0f18d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -25,6 +25,7 @@
 #include "nvme.h"
 #include "fabrics.h"
 #include <linux/nvme-auth.h>
+#include <linux/nvme-keyring.h>
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
@@ -4703,12 +4704,16 @@ static int __init nvme_core_init(void)
 		result = PTR_ERR(nvme_ns_chr_class);
 		goto unregister_generic_ns;
 	}
-
-	result = nvme_init_auth();
+	result = nvme_keyring_init();
 	if (result)
 		goto destroy_ns_chr;
+	result = nvme_init_auth();
+	if (result)
+		goto keyring_exit;
 	return 0;
 
+keyring_exit:
+	nvme_keyring_exit();
 destroy_ns_chr:
 	class_destroy(nvme_ns_chr_class);
 unregister_generic_ns:
@@ -4732,6 +4737,7 @@ static int __init nvme_core_init(void)
 static void __exit nvme_core_exit(void)
 {
 	nvme_exit_auth();
+	nvme_keyring_exit();
 	class_destroy(nvme_ns_chr_class);
 	class_destroy(nvme_subsys_class);
 	class_destroy(nvme_class);
diff --git a/include/linux/nvme-keyring.h b/include/linux/nvme-keyring.h
new file mode 100644
index 000000000000..32bd264a71e6
--- /dev/null
+++ b/include/linux/nvme-keyring.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2023 Hannes Reinecke, SUSE Labs
+ */
+
+#ifndef _NVME_KEYRING_H
+#define _NVME_KEYRING_H
+
+#ifdef CONFIG_NVME_KEYRING
+
+key_serial_t nvme_keyring_id(void);
+int nvme_keyring_init(void);
+void nvme_keyring_exit(void);
+
+#else
+
+static inline key_serial_t nvme_keyring_id(void)
+{
+	return 0;
+}
+static inline int nvme_keyring_init(void)
+{
+	return 0;
+}
+static inline void nvme_keyring_exit(void) {}
+
+#endif /* !CONFIG_NVME_KEYRING */
+#endif /* _NVME_KEYRING_H */
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 02/14] nvme-keyring: define a 'psk' keytype
  2023-08-03 10:50 [PATCHv5 00/14] nvme: In-kernel TLS support for TCP Hannes Reinecke
  2023-08-03 10:50 ` [PATCH 01/14] nvme-keyring: register '.nvme' keyring Hannes Reinecke
@ 2023-08-03 10:50 ` Hannes Reinecke
  2023-08-07  7:11   ` Sagi Grimberg
  2023-08-03 10:50 ` [PATCH 03/14] nvme: add TCP TSAS definitions Hannes Reinecke
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-03 10:50 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

Define a 'psk' keytype to hold the NVMe TLS PSKs.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/common/keyring.c | 94 +++++++++++++++++++++++++++++++++++
 1 file changed, 94 insertions(+)

diff --git a/drivers/nvme/common/keyring.c b/drivers/nvme/common/keyring.c
index 5cf64b278119..494dd365052e 100644
--- a/drivers/nvme/common/keyring.c
+++ b/drivers/nvme/common/keyring.c
@@ -8,6 +8,8 @@
 #include <linux/key-type.h>
 #include <keys/user-type.h>
 #include <linux/nvme.h>
+#include <linux/nvme-tcp.h>
+#include <linux/nvme-keyring.h>
 
 static struct key *nvme_keyring;
 
@@ -17,8 +19,94 @@ key_serial_t nvme_keyring_id(void)
 }
 EXPORT_SYMBOL_GPL(nvme_keyring_id);
 
+static void nvme_tls_psk_describe(const struct key *key, struct seq_file *m)
+{
+	seq_puts(m, key->description);
+	seq_printf(m, ": %u", key->datalen);
+}
+
+static bool nvme_tls_psk_match(const struct key *key,
+			       const struct key_match_data *match_data)
+{
+	const char *match_id;
+	size_t match_len;
+
+	if (!key->description) {
+		pr_debug("%s: no key description\n", __func__);
+		return false;
+	}
+	match_len = strlen(key->description);
+	pr_debug("%s: id %s len %zd\n", __func__, key->description, match_len);
+
+	if (!match_data->raw_data) {
+		pr_debug("%s: no match data\n", __func__);
+		return false;
+	}
+	match_id = match_data->raw_data;
+	pr_debug("%s: match '%s' '%s' len %zd\n",
+		 __func__, match_id, key->description, match_len);
+	return !memcmp(key->description, match_id, match_len);
+}
+
+static int nvme_tls_psk_match_preparse(struct key_match_data *match_data)
+{
+	match_data->lookup_type = KEYRING_SEARCH_LOOKUP_ITERATE;
+	match_data->cmp = nvme_tls_psk_match;
+	return 0;
+}
+
+static struct key_type nvme_tls_psk_key_type = {
+	.name           = "psk",
+	.flags          = KEY_TYPE_NET_DOMAIN,
+	.preparse       = user_preparse,
+	.free_preparse  = user_free_preparse,
+	.match_preparse = nvme_tls_psk_match_preparse,
+	.instantiate    = generic_key_instantiate,
+	.revoke         = user_revoke,
+	.destroy        = user_destroy,
+	.describe       = nvme_tls_psk_describe,
+	.read           = user_read,
+};
+
+static struct key *nvme_tls_psk_lookup(struct key *keyring,
+		const char *hostnqn, const char *subnqn,
+		int hmac, bool generated)
+{
+	char *identity;
+	size_t identity_len = (NVMF_NQN_SIZE) * 2 + 11;
+	key_ref_t keyref;
+	key_serial_t keyring_id;
+
+	identity = kzalloc(identity_len, GFP_KERNEL);
+	if (!identity)
+		return ERR_PTR(-ENOMEM);
+
+	snprintf(identity, identity_len, "NVMe0%c%02d %s %s",
+		 generated ? 'G' : 'R', hmac, hostnqn, subnqn);
+
+	if (!keyring)
+		keyring = nvme_keyring;
+	keyring_id = key_serial(keyring);
+	pr_debug("keyring %x lookup tls psk '%s'\n",
+		 keyring_id, identity);
+	keyref = keyring_search(make_key_ref(keyring, true),
+				&nvme_tls_psk_key_type,
+				identity, false);
+	if (IS_ERR(keyref)) {
+		pr_debug("lookup tls psk '%s' failed, error %ld\n",
+			 identity, PTR_ERR(keyref));
+		kfree(identity);
+		return ERR_PTR(-ENOKEY);
+	}
+	kfree(identity);
+
+	return key_ref_to_ptr(keyref);
+}
+
 int nvme_keyring_init(void)
 {
+	int err;
+
 	nvme_keyring = keyring_alloc(".nvme",
 				     GLOBAL_ROOT_UID, GLOBAL_ROOT_GID,
 				     current_cred(),
@@ -28,12 +116,18 @@ int nvme_keyring_init(void)
 	if (IS_ERR(nvme_keyring))
 		return PTR_ERR(nvme_keyring);
 
+	err = register_key_type(&nvme_tls_psk_key_type);
+	if (err) {
+		key_put(nvme_keyring);
+		return err;
+	}
 	return 0;
 }
 EXPORT_SYMBOL_GPL(nvme_keyring_init);
 
 void nvme_keyring_exit(void)
 {
+	unregister_key_type(&nvme_tls_psk_key_type);
 	key_revoke(nvme_keyring);
 	key_put(nvme_keyring);
 }
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 03/14] nvme: add TCP TSAS definitions
  2023-08-03 10:50 [PATCHv5 00/14] nvme: In-kernel TLS support for TCP Hannes Reinecke
  2023-08-03 10:50 ` [PATCH 01/14] nvme-keyring: register '.nvme' keyring Hannes Reinecke
  2023-08-03 10:50 ` [PATCH 02/14] nvme-keyring: define a 'psk' keytype Hannes Reinecke
@ 2023-08-03 10:50 ` Hannes Reinecke
  2023-08-03 10:50 ` [PATCH 04/14] nvme-tcp: add definitions for TLS cipher suites Hannes Reinecke
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-03 10:50 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
---
 include/linux/nvme.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 26dd3f859d9d..a7ba74babad7 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -108,6 +108,13 @@ enum {
 	NVMF_RDMA_CMS_RDMA_CM	= 1, /* Sockets based endpoint addressing */
 };
 
+/* TSAS SECTYPE for TCP transport */
+enum {
+	NVMF_TCP_SECTYPE_NONE = 0, /* No Security */
+	NVMF_TCP_SECTYPE_TLS12 = 1, /* TLSv1.2, NVMe-oF 1.1 and NVMe-TCP 3.6.1.1 */
+	NVMF_TCP_SECTYPE_TLS13 = 2, /* TLSv1.3, NVMe-oF 1.1 and NVMe-TCP 3.6.1.1 */
+};
+
 #define NVME_AQ_DEPTH		32
 #define NVME_NR_AEN_COMMANDS	1
 #define NVME_AQ_BLK_MQ_DEPTH	(NVME_AQ_DEPTH - NVME_NR_AEN_COMMANDS)
@@ -1493,6 +1500,9 @@ struct nvmf_disc_rsp_page_entry {
 			__u16	pkey;
 			__u8	resv10[246];
 		} rdma;
+		struct tcp {
+			__u8	sectype;
+		} tcp;
 	} tsas;
 };
 
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 04/14] nvme-tcp: add definitions for TLS cipher suites
  2023-08-03 10:50 [PATCHv5 00/14] nvme: In-kernel TLS support for TCP Hannes Reinecke
                   ` (2 preceding siblings ...)
  2023-08-03 10:50 ` [PATCH 03/14] nvme: add TCP TSAS definitions Hannes Reinecke
@ 2023-08-03 10:50 ` Hannes Reinecke
  2023-08-03 10:50 ` [PATCH 05/14] nvme-keyring: implement nvme_tls_psk_default() Hannes Reinecke
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-03 10:50 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
---
 include/linux/nvme-tcp.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/nvme-tcp.h b/include/linux/nvme-tcp.h
index 57ebe1267f7f..e07e8978d691 100644
--- a/include/linux/nvme-tcp.h
+++ b/include/linux/nvme-tcp.h
@@ -18,6 +18,12 @@ enum nvme_tcp_pfv {
 	NVME_TCP_PFV_1_0 = 0x0,
 };
 
+enum nvme_tcp_tls_cipher {
+	NVME_TCP_TLS_CIPHER_INVALID     = 0,
+	NVME_TCP_TLS_CIPHER_SHA256      = 1,
+	NVME_TCP_TLS_CIPHER_SHA384      = 2,
+};
+
 enum nvme_tcp_fatal_error_status {
 	NVME_TCP_FES_INVALID_PDU_HDR		= 0x01,
 	NVME_TCP_FES_PDU_SEQ_ERR		= 0x02,
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 05/14] nvme-keyring: implement nvme_tls_psk_default()
  2023-08-03 10:50 [PATCHv5 00/14] nvme: In-kernel TLS support for TCP Hannes Reinecke
                   ` (3 preceding siblings ...)
  2023-08-03 10:50 ` [PATCH 04/14] nvme-tcp: add definitions for TLS cipher suites Hannes Reinecke
@ 2023-08-03 10:50 ` Hannes Reinecke
  2023-08-07  7:13   ` Sagi Grimberg
  2023-08-03 10:50 ` [PATCH 06/14] security/keys: export key_lookup() Hannes Reinecke
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-03 10:50 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

Implement a function to select the preferred PSK for TLS.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/common/keyring.c | 48 +++++++++++++++++++++++++++++++++++
 include/linux/nvme-keyring.h  |  8 ++++++
 2 files changed, 56 insertions(+)

diff --git a/drivers/nvme/common/keyring.c b/drivers/nvme/common/keyring.c
index 494dd365052e..f8d9a208397b 100644
--- a/drivers/nvme/common/keyring.c
+++ b/drivers/nvme/common/keyring.c
@@ -5,6 +5,7 @@
 
 #include <linux/module.h>
 #include <linux/seq_file.h>
+#include <linux/key.h>
 #include <linux/key-type.h>
 #include <keys/user-type.h>
 #include <linux/nvme.h>
@@ -103,6 +104,53 @@ static struct key *nvme_tls_psk_lookup(struct key *keyring,
 	return key_ref_to_ptr(keyref);
 }
 
+/*
+ * NVMe PSK priority list
+ *
+ * 'Retained' PSKs (ie 'generated == false')
+ * should be preferred to 'generated' PSKs,
+ * and SHA-384 should be preferred to SHA-256.
+ */
+struct nvme_tls_psk_priority_list {
+	bool generated;
+	enum nvme_tcp_tls_cipher cipher;
+} nvme_tls_psk_prio[] = {
+	{ .generated = false,
+	  .cipher = NVME_TCP_TLS_CIPHER_SHA384, },
+	{ .generated = false,
+	  .cipher = NVME_TCP_TLS_CIPHER_SHA256, },
+	{ .generated = true,
+	  .cipher = NVME_TCP_TLS_CIPHER_SHA384, },
+	{ .generated = true,
+	  .cipher = NVME_TCP_TLS_CIPHER_SHA256, },
+};
+
+/*
+ * nvme_tls_psk_default - Return the preferred PSK to use for TLS ClientHello
+ */
+key_serial_t nvme_tls_psk_default(struct key *keyring,
+		      const char *hostnqn, const char *subnqn)
+{
+	struct key *tls_key;
+	key_serial_t tls_key_id;
+	int prio;
+
+	for (prio = 0; prio < ARRAY_SIZE(nvme_tls_psk_prio); prio++) {
+		bool generated = nvme_tls_psk_prio[prio].generated;
+		enum nvme_tcp_tls_cipher cipher = nvme_tls_psk_prio[prio].cipher;
+
+		tls_key = nvme_tls_psk_lookup(keyring, hostnqn, subnqn,
+					      cipher, generated);
+		if (!IS_ERR(tls_key)) {
+			tls_key_id = tls_key->serial;
+			key_put(tls_key);
+			return tls_key_id;
+		}
+	}
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nvme_tls_psk_default);
+
 int nvme_keyring_init(void)
 {
 	int err;
diff --git a/include/linux/nvme-keyring.h b/include/linux/nvme-keyring.h
index 32bd264a71e6..4efea9dd967c 100644
--- a/include/linux/nvme-keyring.h
+++ b/include/linux/nvme-keyring.h
@@ -8,12 +8,20 @@
 
 #ifdef CONFIG_NVME_KEYRING
 
+key_serial_t nvme_tls_psk_default(struct key *keyring,
+		const char *hostnqn, const char *subnqn);
+
 key_serial_t nvme_keyring_id(void);
 int nvme_keyring_init(void);
 void nvme_keyring_exit(void);
 
 #else
 
+static inline key_serial_t nvme_tls_psk_default(struct key *keyring,
+		const char *hostnqn, const char *subnqn)
+{
+	return 0;
+}
 static inline key_serial_t nvme_keyring_id(void)
 {
 	return 0;
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 06/14] security/keys: export key_lookup()
  2023-08-03 10:50 [PATCHv5 00/14] nvme: In-kernel TLS support for TCP Hannes Reinecke
                   ` (4 preceding siblings ...)
  2023-08-03 10:50 ` [PATCH 05/14] nvme-keyring: implement nvme_tls_psk_default() Hannes Reinecke
@ 2023-08-03 10:50 ` Hannes Reinecke
  2023-08-07  7:13   ` Sagi Grimberg
  2023-08-03 10:50 ` [PATCH 07/14] nvme/tcp: allocate socket file Hannes Reinecke
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-03 10:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke, David Howells

For in-kernel consumers one cannot readily assign a user (eg when
running from a workqueue), so the normal key search permissions
cannot be applied.
This patch exports the 'key_lookup()' function for a simple lookup
of keys without checking for permissions.

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 security/keys/key.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/security/keys/key.c b/security/keys/key.c
index 5c0c7df833f8..bd1b7d45df90 100644
--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -693,6 +693,7 @@ struct key *key_lookup(key_serial_t id)
 	spin_unlock(&key_serial_lock);
 	return key;
 }
+EXPORT_SYMBOL_GPL(key_lookup);
 
 /*
  * Find and lock the specified key type against removal.
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 07/14] nvme/tcp: allocate socket file
  2023-08-03 10:50 [PATCHv5 00/14] nvme: In-kernel TLS support for TCP Hannes Reinecke
                   ` (5 preceding siblings ...)
  2023-08-03 10:50 ` [PATCH 06/14] security/keys: export key_lookup() Hannes Reinecke
@ 2023-08-03 10:50 ` Hannes Reinecke
  2023-08-07  7:15   ` Sagi Grimberg
  2023-08-03 10:50 ` [PATCH 08/14] nvme-tcp: enable TLS handshake upcall Hannes Reinecke
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-03 10:50 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

When using the TLS upcall we need to allocate a socket file such
that the userspace daemon is able to use the socket.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/host/tcp.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 9ce417cd32a7..c3b3b938c0f9 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1338,7 +1338,9 @@ static void nvme_tcp_free_queue(struct nvme_ctrl *nctrl, int qid)
 	}
 
 	noreclaim_flag = memalloc_noreclaim_save();
-	sock_release(queue->sock);
+	/* ->sock will be released by fput() */
+	fput(queue->sock->file);
+	queue->sock = NULL;
 	memalloc_noreclaim_restore(noreclaim_flag);
 
 	kfree(queue->pdu);
@@ -1512,6 +1514,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid)
 	struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
 	struct nvme_tcp_queue *queue = &ctrl->queues[qid];
 	int ret, rcv_pdu_size;
+	struct file *sock_file;
 
 	mutex_init(&queue->queue_lock);
 	queue->ctrl = ctrl;
@@ -1534,6 +1537,12 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid)
 		goto err_destroy_mutex;
 	}
 
+	sock_file = sock_alloc_file(queue->sock, O_CLOEXEC, NULL);
+	if (IS_ERR(sock_file)) {
+		sock_release(queue->sock);
+		ret = PTR_ERR(sock_file);
+		goto err_sock;
+	}
 	nvme_tcp_reclassify_socket(queue->sock);
 
 	/* Single syn retry */
@@ -1640,7 +1649,8 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid)
 	if (queue->hdr_digest || queue->data_digest)
 		nvme_tcp_free_crypto(queue);
 err_sock:
-	sock_release(queue->sock);
+	/* ->sock will be released by fput() */
+	fput(queue->sock->file);
 	queue->sock = NULL;
 err_destroy_mutex:
 	mutex_destroy(&queue->send_mutex);
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 08/14] nvme-tcp: enable TLS handshake upcall
  2023-08-03 10:50 [PATCHv5 00/14] nvme: In-kernel TLS support for TCP Hannes Reinecke
                   ` (6 preceding siblings ...)
  2023-08-03 10:50 ` [PATCH 07/14] nvme/tcp: allocate socket file Hannes Reinecke
@ 2023-08-03 10:50 ` Hannes Reinecke
  2023-08-07  8:20   ` Sagi Grimberg
  2023-08-03 10:50 ` [PATCH 09/14] nvme-tcp: control message handling for recvmsg() Hannes Reinecke
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-03 10:50 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

Add a fabrics option 'tls' and start the TLS handshake upcall
with the default PSK. When TLS is started the PSK key serial
number is displayed in the sysfs attribute 'tls_key'

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/host/Kconfig   |  14 ++++
 drivers/nvme/host/core.c    |   2 +-
 drivers/nvme/host/fabrics.c |  12 ++++
 drivers/nvme/host/fabrics.h |   3 +
 drivers/nvme/host/nvme.h    |   1 +
 drivers/nvme/host/sysfs.c   |  20 ++++++
 drivers/nvme/host/tcp.c     | 130 ++++++++++++++++++++++++++++++++++--
 7 files changed, 175 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
index 2f6a7f8c94e8..96a74041bb0a 100644
--- a/drivers/nvme/host/Kconfig
+++ b/drivers/nvme/host/Kconfig
@@ -92,6 +92,20 @@ config NVME_TCP
 
 	  If unsure, say N.
 
+config NVME_TCP_TLS
+	bool "NVMe over Fabrics TCP TLS encryption support"
+	depends on NVME_TCP
+	select NVME_COMMON
+	select NVME_KEYRING
+	select NET_HANDSHAKE
+	help
+	  Enables TLS encryption for NVMe TCP using the netlink handshake API.
+
+	  The TLS handshake daemon is availble at
+	  https://github.com/oracle/ktls-utils.
+
+	  If unsure, say N.
+
 config NVME_AUTH
 	bool "NVM Express over Fabrics In-Band Authentication"
 	depends on NVME_CORE
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index dfc574d0f18d..b52e9c9bffd6 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4380,7 +4380,7 @@ static void nvme_free_ctrl(struct device *dev)
 
 	if (!subsys || ctrl->instance != subsys->instance)
 		ida_free(&nvme_instance_ida, ctrl->instance);
-
+	key_put(ctrl->tls_key);
 	nvme_free_cels(ctrl);
 	nvme_mpath_uninit(ctrl);
 	nvme_auth_stop(ctrl);
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 8175d49f2909..ddad482c3537 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -647,6 +647,9 @@ static const match_table_t opt_tokens = {
 	{ NVMF_OPT_DISCOVERY,		"discovery"		},
 	{ NVMF_OPT_DHCHAP_SECRET,	"dhchap_secret=%s"	},
 	{ NVMF_OPT_DHCHAP_CTRL_SECRET,	"dhchap_ctrl_secret=%s"	},
+#ifdef CONFIG_NVME_TCP_TLS
+	{ NVMF_OPT_TLS,			"tls"			},
+#endif
 	{ NVMF_OPT_ERR,			NULL			}
 };
 
@@ -671,6 +674,7 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 	opts->hdr_digest = false;
 	opts->data_digest = false;
 	opts->tos = -1; /* < 0 == use transport default */
+	opts->tls = false;
 
 	options = o = kstrdup(buf, GFP_KERNEL);
 	if (!options)
@@ -955,6 +959,14 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 			kfree(opts->dhchap_ctrl_secret);
 			opts->dhchap_ctrl_secret = p;
 			break;
+		case NVMF_OPT_TLS:
+			if (!IS_ENABLED(CONFIG_NVME_TCP_TLS)) {
+				pr_err("TLS is not supported\n");
+				ret = -EINVAL;
+				goto out;
+			}
+			opts->tls = true;
+			break;
 		default:
 			pr_warn("unknown parameter or missing value '%s' in ctrl creation request\n",
 				p);
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index 82e7a27ffbde..dac17c3fee26 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -70,6 +70,7 @@ enum {
 	NVMF_OPT_DISCOVERY	= 1 << 22,
 	NVMF_OPT_DHCHAP_SECRET	= 1 << 23,
 	NVMF_OPT_DHCHAP_CTRL_SECRET = 1 << 24,
+	NVMF_OPT_TLS		= 1 << 25,
 };
 
 /**
@@ -102,6 +103,7 @@ enum {
  * @dhchap_secret: DH-HMAC-CHAP secret
  * @dhchap_ctrl_secret: DH-HMAC-CHAP controller secret for bi-directional
  *              authentication
+ * @tls:        Start TLS encrypted connections (TCP)
  * @disable_sqflow: disable controller sq flow control
  * @hdr_digest: generate/verify header digest (TCP)
  * @data_digest: generate/verify data digest (TCP)
@@ -128,6 +130,7 @@ struct nvmf_ctrl_options {
 	struct nvmf_host	*host;
 	char			*dhchap_secret;
 	char			*dhchap_ctrl_secret;
+	bool			tls;
 	bool			disable_sqflow;
 	bool			hdr_digest;
 	bool			data_digest;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index f35647c470af..6fe7966f720b 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -357,6 +357,7 @@ struct nvme_ctrl {
 	struct nvme_dhchap_key *ctrl_key;
 	u16 transaction;
 #endif
+	struct key *tls_key;
 
 	/* Power saving configuration */
 	u64 ps_max_latency_us;
diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index 212e1b05d298..03a64674d0a4 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -527,6 +527,19 @@ static DEVICE_ATTR(dhchap_ctrl_secret, S_IRUGO | S_IWUSR,
 	nvme_ctrl_dhchap_ctrl_secret_show, nvme_ctrl_dhchap_ctrl_secret_store);
 #endif
 
+#ifdef CONFIG_NVME_TCP_TLS
+static ssize_t tls_key_show(struct device *dev,
+			    struct device_attribute *attr, char *buf)
+{
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+
+	if (!ctrl->tls_key)
+		return 0;
+	return sysfs_emit(buf, "%08x", key_serial(ctrl->tls_key));
+}
+static DEVICE_ATTR_RO(tls_key);
+#endif
+
 static struct attribute *nvme_dev_attrs[] = {
 	&dev_attr_reset_controller.attr,
 	&dev_attr_rescan_controller.attr,
@@ -553,6 +566,9 @@ static struct attribute *nvme_dev_attrs[] = {
 #ifdef CONFIG_NVME_AUTH
 	&dev_attr_dhchap_secret.attr,
 	&dev_attr_dhchap_ctrl_secret.attr,
+#endif
+#ifdef CONFIG_NVME_TCP_TLS
+	&dev_attr_tls_key.attr,
 #endif
 	NULL
 };
@@ -583,6 +599,10 @@ static umode_t nvme_dev_attrs_are_visible(struct kobject *kobj,
 	if (a == &dev_attr_dhchap_ctrl_secret.attr && !ctrl->opts)
 		return 0;
 #endif
+#ifdef CONFIG_NVME_TCP_TLS
+	if (a == &dev_attr_tls_key.attr && !ctrl->opts)
+		return 0;
+#endif
 
 	return a->mode;
 }
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index c3b3b938c0f9..98400ad5f969 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -8,9 +8,13 @@
 #include <linux/init.h>
 #include <linux/slab.h>
 #include <linux/err.h>
+#include <linux/key.h>
 #include <linux/nvme-tcp.h>
+#include <linux/nvme-keyring.h>
 #include <net/sock.h>
 #include <net/tcp.h>
+#include <net/tls.h>
+#include <net/handshake.h>
 #include <linux/blk-mq.h>
 #include <crypto/hash.h>
 #include <net/busy_poll.h>
@@ -31,6 +35,16 @@ static int so_priority;
 module_param(so_priority, int, 0644);
 MODULE_PARM_DESC(so_priority, "nvme tcp socket optimize priority");
 
+#ifdef CONFIG_NVME_TCP_TLS
+/*
+ * TLS handshake timeout
+ */
+static int tls_handshake_timeout = 10;
+module_param(tls_handshake_timeout, int, 0644);
+MODULE_PARM_DESC(tls_handshake_timeout,
+		 "nvme TLS handshake timeout in seconds (default 10)");
+#endif
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 /* lockdep can detect a circular dependency of the form
  *   sk_lock -> mmap_lock (page fault) -> fs locks -> sk_lock
@@ -146,7 +160,10 @@ struct nvme_tcp_queue {
 	struct ahash_request	*snd_hash;
 	__le32			exp_ddgst;
 	__le32			recv_ddgst;
-
+#ifdef CONFIG_NVME_TCP_TLS
+	struct completion       tls_complete;
+	int                     tls_err;
+#endif
 	struct page_frag_cache	pf_cache;
 
 	void (*state_change)(struct sock *);
@@ -1509,7 +1526,80 @@ static void nvme_tcp_set_queue_io_cpu(struct nvme_tcp_queue *queue)
 	queue->io_cpu = cpumask_next_wrap(n - 1, cpu_online_mask, -1, false);
 }
 
-static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid)
+#ifdef CONFIG_NVME_TCP_TLS
+static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid)
+{
+	struct nvme_tcp_queue *queue = data;
+	struct nvme_tcp_ctrl *ctrl = queue->ctrl;
+	int qid = nvme_tcp_queue_id(queue);
+	struct key *tls_key;
+
+	dev_dbg(ctrl->ctrl.device, "queue %d: TLS handshake done, key %x, status %d\n",
+		qid, pskid, status);
+
+	if (status) {
+		queue->tls_err = -status;
+		goto out_complete;
+	}
+
+	tls_key = key_lookup(pskid);
+	if (IS_ERR(tls_key)) {
+		dev_warn(ctrl->ctrl.device, "queue %d: Invalid key %x\n",
+			 qid, pskid);
+		queue->tls_err = -ENOKEY;
+	} else {
+		ctrl->ctrl.tls_key = tls_key;
+		queue->tls_err = 0;
+	}
+
+out_complete:
+	complete(&queue->tls_complete);
+}
+
+static int nvme_tcp_start_tls(struct nvme_ctrl *nctrl,
+			      struct nvme_tcp_queue *queue,
+			      key_serial_t pskid)
+{
+	int qid = nvme_tcp_queue_id(queue);
+	int ret;
+	struct tls_handshake_args args;
+	unsigned long tmo = tls_handshake_timeout * HZ;
+	key_serial_t keyring = nvme_keyring_id();
+
+	dev_dbg(nctrl->device, "queue %d: start TLS with key %x\n",
+		qid, pskid);
+	memset(&args, 0, sizeof(args));
+	args.ta_sock = queue->sock;
+	args.ta_done = nvme_tcp_tls_done;
+	args.ta_data = queue;
+	args.ta_my_peerids[0] = pskid;
+	args.ta_num_peerids = 1;
+	args.ta_keyring = keyring;
+	args.ta_timeout_ms = tls_handshake_timeout * 1000;
+	queue->tls_err = -EOPNOTSUPP;
+	init_completion(&queue->tls_complete);
+	ret = tls_client_hello_psk(&args, GFP_KERNEL);
+	if (ret) {
+		dev_err(nctrl->device, "queue %d: failed to start TLS: %d\n",
+			qid, ret);
+		return ret;
+	}
+	if (wait_for_completion_timeout(&queue->tls_complete, tmo) == 0) {
+		dev_err(nctrl->device,
+			"queue %d: TLS handshake timeout\n", qid);
+		ret = -ETIMEDOUT;
+	} else {
+		dev_dbg(nctrl->device,
+			"queue %d: TLS handshake complete, error %d\n",
+			qid, queue->tls_err);
+		ret = queue->tls_err;
+	}
+	return ret;
+}
+#endif
+
+static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid,
+				key_serial_t pskid)
 {
 	struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
 	struct nvme_tcp_queue *queue = &ctrl->queues[qid];
@@ -1633,6 +1723,14 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid)
 		goto err_rcv_pdu;
 	}
 
+#ifdef CONFIG_NVME_TCP_TLS
+	/* If PSKs are configured try to start TLS */
+	if (pskid) {
+		ret = nvme_tcp_start_tls(nctrl, queue, pskid);
+		if (ret)
+			goto err_init_connect;
+	}
+#endif
 	ret = nvme_tcp_init_connection(queue);
 	if (ret)
 		goto err_init_connect;
@@ -1782,10 +1880,22 @@ static int nvme_tcp_start_io_queues(struct nvme_ctrl *ctrl,
 static int nvme_tcp_alloc_admin_queue(struct nvme_ctrl *ctrl)
 {
 	int ret;
+	key_serial_t psk_id = 0;
+
+	if (ctrl->opts->tls) {
+		psk_id = nvme_tls_psk_default(NULL,
+					      ctrl->opts->host->nqn,
+					      ctrl->opts->subsysnqn);
+		if (!psk_id) {
+			dev_err(ctrl->device, "no valid PSK found\n");
+			ret = -ENOKEY;
+			goto out_free_queue;
+		}
+	}
 
-	ret = nvme_tcp_alloc_queue(ctrl, 0);
+	ret = nvme_tcp_alloc_queue(ctrl, 0, psk_id);
 	if (ret)
-		return ret;
+		goto out_free_queue;
 
 	ret = nvme_tcp_alloc_async_req(to_tcp_ctrl(ctrl));
 	if (ret)
@@ -1801,9 +1911,17 @@ static int nvme_tcp_alloc_admin_queue(struct nvme_ctrl *ctrl)
 static int __nvme_tcp_alloc_io_queues(struct nvme_ctrl *ctrl)
 {
 	int i, ret;
+	key_serial_t psk_id = 0;
 
+	if (ctrl->opts->tls) {
+		if (!ctrl->tls_key) {
+			dev_err(ctrl->device, "no PSK negotiated\n");
+			return -ENOKEY;
+		}
+		psk_id = key_serial(ctrl->tls_key);
+	}
 	for (i = 1; i < ctrl->queue_count; i++) {
-		ret = nvme_tcp_alloc_queue(ctrl, i);
+		ret = nvme_tcp_alloc_queue(ctrl, i, psk_id);
 		if (ret)
 			goto out_free_queues;
 	}
@@ -2630,7 +2748,7 @@ static struct nvmf_transport_ops nvme_tcp_transport = {
 			  NVMF_OPT_HOST_TRADDR | NVMF_OPT_CTRL_LOSS_TMO |
 			  NVMF_OPT_HDR_DIGEST | NVMF_OPT_DATA_DIGEST |
 			  NVMF_OPT_NR_WRITE_QUEUES | NVMF_OPT_NR_POLL_QUEUES |
-			  NVMF_OPT_TOS | NVMF_OPT_HOST_IFACE,
+			  NVMF_OPT_TOS | NVMF_OPT_HOST_IFACE | NVMF_OPT_TLS,
 	.create_ctrl	= nvme_tcp_create_ctrl,
 };
 
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 09/14] nvme-tcp: control message handling for recvmsg()
  2023-08-03 10:50 [PATCHv5 00/14] nvme: In-kernel TLS support for TCP Hannes Reinecke
                   ` (7 preceding siblings ...)
  2023-08-03 10:50 ` [PATCH 08/14] nvme-tcp: enable TLS handshake upcall Hannes Reinecke
@ 2023-08-03 10:50 ` Hannes Reinecke
  2023-08-07  8:22   ` Sagi Grimberg
  2023-08-03 10:50 ` [PATCH 10/14] nvme-fabrics: parse options 'keyring' and 'tls_key' Hannes Reinecke
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-03 10:50 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

kTLS is sending TLS ALERT messages as control messages for recvmsg().
As we can't do anything sensible with it just abort the connection
and let the userspace agent to a re-negotiation.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/tcp.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 98400ad5f969..4d0e3de39c26 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -14,6 +14,7 @@
 #include <net/sock.h>
 #include <net/tcp.h>
 #include <net/tls.h>
+#include <net/tls_prot.h>
 #include <net/handshake.h>
 #include <linux/blk-mq.h>
 #include <crypto/hash.h>
@@ -1369,6 +1370,10 @@ static int nvme_tcp_init_connection(struct nvme_tcp_queue *queue)
 {
 	struct nvme_tcp_icreq_pdu *icreq;
 	struct nvme_tcp_icresp_pdu *icresp;
+#ifdef CONFIG_NVME_TCP_TLS
+	char cbuf[CMSG_LEN(sizeof(char))] = {};
+	u8 ctype;
+#endif
 	struct msghdr msg = {};
 	struct kvec iov;
 	bool ctrl_hdgst, ctrl_ddgst;
@@ -1406,11 +1411,22 @@ static int nvme_tcp_init_connection(struct nvme_tcp_queue *queue)
 	memset(&msg, 0, sizeof(msg));
 	iov.iov_base = icresp;
 	iov.iov_len = sizeof(*icresp);
+#ifdef CONFIG_NVME_TCP_TLS
+	msg.msg_control = cbuf;
+	msg.msg_controllen = sizeof(cbuf);
+#endif
 	ret = kernel_recvmsg(queue->sock, &msg, &iov, 1,
 			iov.iov_len, msg.msg_flags);
 	if (ret < 0)
 		goto free_icresp;
-
+#ifdef CONFIG_NVME_TCP_TLS
+	ctype = tls_get_record_type(queue->sock->sk, (struct cmsghdr *)cbuf);
+	if (ctype != TLS_RECORD_TYPE_DATA) {
+		pr_err("queue %d: unhandled TLS record %d\n",
+		       nvme_tcp_queue_id(queue), ctype);
+		return -ENOTCONN;
+	}
+#endif
 	ret = -EINVAL;
 	if (icresp->hdr.type != nvme_tcp_icresp) {
 		pr_err("queue %d: bad type returned %d\n",
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 10/14] nvme-fabrics: parse options 'keyring' and 'tls_key'
  2023-08-03 10:50 [PATCHv5 00/14] nvme: In-kernel TLS support for TCP Hannes Reinecke
                   ` (8 preceding siblings ...)
  2023-08-03 10:50 ` [PATCH 09/14] nvme-tcp: control message handling for recvmsg() Hannes Reinecke
@ 2023-08-03 10:50 ` Hannes Reinecke
  2023-08-07  8:23   ` Sagi Grimberg
  2023-08-03 10:50 ` [PATCH 11/14] nvmet: make TCP sectype settable via configfs Hannes Reinecke
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-03 10:50 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

Parse the fabrics options 'keyring' and 'tls_key' and store the
referenced keys in the options structure.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/host/fabrics.c | 69 ++++++++++++++++++++++++++++++++++++-
 drivers/nvme/host/fabrics.h |  6 ++++
 drivers/nvme/host/tcp.c     | 11 ++++--
 3 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index ddad482c3537..6bdcd505a477 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -643,6 +643,10 @@ static const match_table_t opt_tokens = {
 	{ NVMF_OPT_NR_WRITE_QUEUES,	"nr_write_queues=%d"	},
 	{ NVMF_OPT_NR_POLL_QUEUES,	"nr_poll_queues=%d"	},
 	{ NVMF_OPT_TOS,			"tos=%d"		},
+#ifdef CONFIG_NVME_TCP_TLS
+	{ NVMF_OPT_KEYRING,		"keyring=%d"		},
+	{ NVMF_OPT_TLS_KEY,		"tls_key=%d"		},
+#endif
 	{ NVMF_OPT_FAIL_FAST_TMO,	"fast_io_fail_tmo=%d"	},
 	{ NVMF_OPT_DISCOVERY,		"discovery"		},
 	{ NVMF_OPT_DHCHAP_SECRET,	"dhchap_secret=%s"	},
@@ -660,9 +664,10 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 	char *options, *o, *p;
 	int token, ret = 0;
 	size_t nqnlen  = 0;
-	int ctrl_loss_tmo = NVMF_DEF_CTRL_LOSS_TMO;
+	int ctrl_loss_tmo = NVMF_DEF_CTRL_LOSS_TMO, key_id;
 	uuid_t hostid;
 	char hostnqn[NVMF_NQN_SIZE];
+	struct key *key = NULL;
 
 	/* Set defaults */
 	opts->queue_size = NVMF_DEF_QUEUE_SIZE;
@@ -928,6 +933,66 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 			}
 			opts->tos = token;
 			break;
+		case NVMF_OPT_KEYRING:
+			if (!IS_ENABLED(CONFIG_NVME_TCP_TLS)) {
+				pr_err("TLS is not supported\n");
+				ret = -EINVAL;
+				goto out;
+			}
+			if (match_int(args, &key_id)) {
+				ret = -EINVAL;
+				goto out;
+			}
+			if (key_id < 0) {
+				pr_err("Invalid keyring id %d\n", key_id);
+				ret = -EINVAL;
+				goto out;
+			}
+			if (!key_id) {
+				pr_debug("Using default keyring\n");
+				key_put(opts->keyring);
+				opts->keyring = NULL;
+				break;
+			}
+			key = key_lookup(key_id);
+			if (!key) {
+				pr_err("Keyring id %08x not found\n", key_id);
+				ret = -ENOKEY;
+				goto out;
+			}
+			key_put(opts->keyring);
+			opts->keyring = key;
+			break;
+		case NVMF_OPT_TLS_KEY:
+			if (!IS_ENABLED(CONFIG_NVME_TCP_TLS)) {
+				pr_err("TLS is not supported\n");
+				ret = -EINVAL;
+				goto out;
+			}
+			if (match_int(args, &key_id)) {
+				ret = -EINVAL;
+				goto out;
+			}
+			if (key_id < 0) {
+				pr_err("Invalid key id %d\n", key_id);
+				ret = -EINVAL;
+				goto out;
+			}
+			if (!key_id) {
+				pr_debug("Using 'best' PSK\n");
+				key_put(opts->tls_key);
+				opts->tls_key = NULL;
+				break;
+			}
+			key = key_lookup(key_id);
+			if (!key) {
+				pr_err("Key id %08x not found\n", key_id);
+				ret = -ENOKEY;
+				goto out;
+			}
+			key_put(opts->tls_key);
+			opts->tls_key = key;
+			break;
 		case NVMF_OPT_DISCOVERY:
 			opts->discovery_nqn = true;
 			break;
@@ -1168,6 +1233,8 @@ static int nvmf_check_allowed_opts(struct nvmf_ctrl_options *opts,
 void nvmf_free_options(struct nvmf_ctrl_options *opts)
 {
 	nvmf_host_put(opts->host);
+	key_put(opts->keyring);
+	key_put(opts->tls_key);
 	kfree(opts->transport);
 	kfree(opts->traddr);
 	kfree(opts->trsvcid);
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index dac17c3fee26..fbaee5a7be19 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -71,6 +71,8 @@ enum {
 	NVMF_OPT_DHCHAP_SECRET	= 1 << 23,
 	NVMF_OPT_DHCHAP_CTRL_SECRET = 1 << 24,
 	NVMF_OPT_TLS		= 1 << 25,
+	NVMF_OPT_KEYRING	= 1 << 26,
+	NVMF_OPT_TLS_KEY	= 1 << 27,
 };
 
 /**
@@ -103,6 +105,8 @@ enum {
  * @dhchap_secret: DH-HMAC-CHAP secret
  * @dhchap_ctrl_secret: DH-HMAC-CHAP controller secret for bi-directional
  *              authentication
+ * @keyring:    Keyring to use for key lookups
+ * @tls_key:    TLS key for encrypted connections (TCP)
  * @tls:        Start TLS encrypted connections (TCP)
  * @disable_sqflow: disable controller sq flow control
  * @hdr_digest: generate/verify header digest (TCP)
@@ -130,6 +134,8 @@ struct nvmf_ctrl_options {
 	struct nvmf_host	*host;
 	char			*dhchap_secret;
 	char			*dhchap_ctrl_secret;
+	struct key		*keyring;
+	struct key		*tls_key;
 	bool			tls;
 	bool			disable_sqflow;
 	bool			hdr_digest;
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 4d0e3de39c26..cc6a672b8d3f 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1584,6 +1584,8 @@ static int nvme_tcp_start_tls(struct nvme_ctrl *nctrl,
 
 	dev_dbg(nctrl->device, "queue %d: start TLS with key %x\n",
 		qid, pskid);
+	if (nctrl->opts->keyring)
+		keyring = key_serial(nctrl->opts->keyring);
 	memset(&args, 0, sizeof(args));
 	args.ta_sock = queue->sock;
 	args.ta_done = nvme_tcp_tls_done;
@@ -1899,9 +1901,12 @@ static int nvme_tcp_alloc_admin_queue(struct nvme_ctrl *ctrl)
 	key_serial_t psk_id = 0;
 
 	if (ctrl->opts->tls) {
-		psk_id = nvme_tls_psk_default(NULL,
-					      ctrl->opts->host->nqn,
-					      ctrl->opts->subsysnqn);
+		if (ctrl->opts->tls_key)
+			psk_id = key_serial(ctrl->opts->tls_key);
+		else
+			psk_id = nvme_tls_psk_default(ctrl->opts->keyring,
+						      ctrl->opts->host->nqn,
+						      ctrl->opts->subsysnqn);
 		if (!psk_id) {
 			dev_err(ctrl->device, "no valid PSK found\n");
 			ret = -ENOKEY;
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 11/14] nvmet: make TCP sectype settable via configfs
  2023-08-03 10:50 [PATCHv5 00/14] nvme: In-kernel TLS support for TCP Hannes Reinecke
                   ` (9 preceding siblings ...)
  2023-08-03 10:50 ` [PATCH 10/14] nvme-fabrics: parse options 'keyring' and 'tls_key' Hannes Reinecke
@ 2023-08-03 10:50 ` Hannes Reinecke
  2023-08-07  8:25   ` Sagi Grimberg
  2023-08-03 10:51 ` [PATCH 12/14] nvmet-tcp: allocate socket file Hannes Reinecke
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-03 10:50 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

Add a new configfs attribute 'addr_tsas' to make the TCP sectype
settable via configfs.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/target/configfs.c | 65 ++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
index 907143870da5..7f826ac8b75c 100644
--- a/drivers/nvme/target/configfs.c
+++ b/drivers/nvme/target/configfs.c
@@ -303,6 +303,11 @@ static void nvmet_port_init_tsas_rdma(struct nvmet_port *port)
 	port->disc_addr.tsas.rdma.cms = NVMF_RDMA_CMS_RDMA_CM;
 }
 
+static void nvmet_port_init_tsas_tcp(struct nvmet_port *port, int sectype)
+{
+	port->disc_addr.tsas.tcp.sectype = sectype;
+}
+
 static ssize_t nvmet_addr_trtype_store(struct config_item *item,
 		const char *page, size_t count)
 {
@@ -325,11 +330,70 @@ static ssize_t nvmet_addr_trtype_store(struct config_item *item,
 	port->disc_addr.trtype = nvmet_transport[i].type;
 	if (port->disc_addr.trtype == NVMF_TRTYPE_RDMA)
 		nvmet_port_init_tsas_rdma(port);
+	else if (port->disc_addr.trtype == NVMF_TRTYPE_TCP)
+		nvmet_port_init_tsas_tcp(port, NVMF_TCP_SECTYPE_NONE);
 	return count;
 }
 
 CONFIGFS_ATTR(nvmet_, addr_trtype);
 
+static const struct nvmet_type_name_map nvmet_addr_tsas_tcp[] = {
+	{ NVMF_TCP_SECTYPE_NONE,	"none" },
+	{ NVMF_TCP_SECTYPE_TLS13,	"tls1.3" },
+};
+
+static const struct nvmet_type_name_map nvmet_addr_tsas_rdma[] = {
+	{ NVMF_RDMA_QPTYPE_CONNECTED,	"connected" },
+	{ NVMF_RDMA_QPTYPE_DATAGRAM,	"datagram"  },
+};
+
+static ssize_t nvmet_addr_tsas_show(struct config_item *item,
+		char *page)
+{
+	struct nvmet_port *port = to_nvmet_port(item);
+	int i;
+
+	if (port->disc_addr.trtype == NVMF_TRTYPE_TCP) {
+		for (i = 0; i < ARRAY_SIZE(nvmet_addr_tsas_tcp); i++) {
+			if (port->disc_addr.tsas.tcp.sectype == nvmet_addr_tsas_tcp[i].type)
+				return sprintf(page, "%s\n", nvmet_addr_tsas_tcp[i].name);
+		}
+	} else if (port->disc_addr.trtype == NVMF_TRTYPE_RDMA) {
+		for (i = 0; i < ARRAY_SIZE(nvmet_addr_tsas_rdma); i++) {
+			if (port->disc_addr.tsas.rdma.qptype == nvmet_addr_tsas_rdma[i].type)
+				return sprintf(page, "%s\n", nvmet_addr_tsas_rdma[i].name);
+		}
+	}
+	return sprintf(page, "reserved\n");
+}
+
+static ssize_t nvmet_addr_tsas_store(struct config_item *item,
+		const char *page, size_t count)
+{
+	struct nvmet_port *port = to_nvmet_port(item);
+	int i;
+
+	if (nvmet_is_port_enabled(port, __func__))
+		return -EACCES;
+
+	if (port->disc_addr.trtype != NVMF_TRTYPE_TCP)
+		return -EINVAL;
+
+	for (i = 0; i < ARRAY_SIZE(nvmet_addr_tsas_tcp); i++) {
+		if (sysfs_streq(page, nvmet_addr_tsas_tcp[i].name))
+			goto found;
+	}
+
+	pr_err("Invalid value '%s' for tsas\n", page);
+	return -EINVAL;
+
+found:
+	nvmet_port_init_tsas_tcp(port, nvmet_addr_tsas_tcp[i].type);
+	return count;
+}
+
+CONFIGFS_ATTR(nvmet_, addr_tsas);
+
 /*
  * Namespace structures & file operation functions below
  */
@@ -1741,6 +1805,7 @@ static struct configfs_attribute *nvmet_port_attrs[] = {
 	&nvmet_attr_addr_traddr,
 	&nvmet_attr_addr_trsvcid,
 	&nvmet_attr_addr_trtype,
+	&nvmet_attr_addr_tsas,
 	&nvmet_attr_param_inline_data_size,
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 	&nvmet_attr_param_pi_enable,
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 12/14] nvmet-tcp: allocate socket file
  2023-08-03 10:50 [PATCHv5 00/14] nvme: In-kernel TLS support for TCP Hannes Reinecke
                   ` (10 preceding siblings ...)
  2023-08-03 10:50 ` [PATCH 11/14] nvmet: make TCP sectype settable via configfs Hannes Reinecke
@ 2023-08-03 10:51 ` Hannes Reinecke
  2023-08-07  8:27   ` Sagi Grimberg
  2023-08-03 10:51 ` [PATCH 13/14] nvmet-tcp: enable TLS handshake upcall Hannes Reinecke
  2023-08-03 10:51 ` [PATCH 14/14] nvmet-tcp: control messages for recvmsg() Hannes Reinecke
  13 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-03 10:51 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

For the TLS upcall we need to allocate a socket file such
that the userspace daemon is able to use the socket.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/target/tcp.c | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 868aa4de2e4c..fdc351f591a4 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1493,12 +1493,12 @@ static void nvmet_tcp_release_queue_work(struct work_struct *w)
 	nvmet_sq_destroy(&queue->nvme_sq);
 	cancel_work_sync(&queue->io_work);
 	nvmet_tcp_free_cmd_data_in_buffers(queue);
-	sock_release(queue->sock);
+	/* ->sock will be released by fput() */
+	fput(queue->sock->file);
 	nvmet_tcp_free_cmds(queue);
 	if (queue->hdr_digest || queue->data_digest)
 		nvmet_tcp_free_crypto(queue);
 	ida_free(&nvmet_tcp_queue_ida, queue->idx);
-
 	page = virt_to_head_page(queue->pf_cache.va);
 	__page_frag_cache_drain(page, queue->pf_cache.pagecnt_bias);
 	kfree(queue);
@@ -1621,15 +1621,16 @@ static int nvmet_tcp_set_queue_sock(struct nvmet_tcp_queue *queue)
 	return ret;
 }
 
-static int nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
+static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
 		struct socket *newsock)
 {
 	struct nvmet_tcp_queue *queue;
+	struct file *sock_file;
 	int ret;
 
 	queue = kzalloc(sizeof(*queue), GFP_KERNEL);
 	if (!queue)
-		return -ENOMEM;
+		return;
 
 	INIT_WORK(&queue->release_work, nvmet_tcp_release_queue_work);
 	INIT_WORK(&queue->io_work, nvmet_tcp_io_work);
@@ -1642,10 +1643,16 @@ static int nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
 	init_llist_head(&queue->resp_list);
 	INIT_LIST_HEAD(&queue->resp_send_list);
 
+	sock_file = sock_alloc_file(queue->sock, O_CLOEXEC, NULL);
+	if (IS_ERR(sock_file)) {
+		ret = PTR_ERR(sock_file);
+		goto out_free_queue;
+	}
+
 	queue->idx = ida_alloc(&nvmet_tcp_queue_ida, GFP_KERNEL);
 	if (queue->idx < 0) {
 		ret = queue->idx;
-		goto out_free_queue;
+		goto out_sock;
 	}
 
 	ret = nvmet_tcp_alloc_cmd(queue, &queue->connect);
@@ -1666,7 +1673,7 @@ static int nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
 	if (ret)
 		goto out_destroy_sq;
 
-	return 0;
+	return;
 out_destroy_sq:
 	mutex_lock(&nvmet_tcp_queue_mutex);
 	list_del_init(&queue->queue_list);
@@ -1676,9 +1683,11 @@ static int nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
 	nvmet_tcp_free_cmd(&queue->connect);
 out_ida_remove:
 	ida_free(&nvmet_tcp_queue_ida, queue->idx);
+out_sock:
+	fput(queue->sock->file);
 out_free_queue:
 	kfree(queue);
-	return ret;
+	pr_err("failed to allocate queue");
 }
 
 static void nvmet_tcp_accept_work(struct work_struct *w)
@@ -1695,11 +1704,7 @@ static void nvmet_tcp_accept_work(struct work_struct *w)
 				pr_warn("failed to accept err=%d\n", ret);
 			return;
 		}
-		ret = nvmet_tcp_alloc_queue(port, newsock);
-		if (ret) {
-			pr_err("failed to allocate queue\n");
-			sock_release(newsock);
-		}
+		nvmet_tcp_alloc_queue(port, newsock);
 	}
 }
 
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 13/14] nvmet-tcp: enable TLS handshake upcall
  2023-08-03 10:50 [PATCHv5 00/14] nvme: In-kernel TLS support for TCP Hannes Reinecke
                   ` (11 preceding siblings ...)
  2023-08-03 10:51 ` [PATCH 12/14] nvmet-tcp: allocate socket file Hannes Reinecke
@ 2023-08-03 10:51 ` Hannes Reinecke
  2023-08-07  8:51   ` Sagi Grimberg
  2023-08-03 10:51 ` [PATCH 14/14] nvmet-tcp: control messages for recvmsg() Hannes Reinecke
  13 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-03 10:51 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

Add functions to start the TLS handshake upcall when
the TCP TSAS sectype is set to 'tls1.3' and add a config
option NVME_TARGET_TCP_TLS.

Signed-off-by: Hannes Reincke <hare@suse.de>
---
 drivers/nvme/target/Kconfig    |  14 ++++
 drivers/nvme/target/configfs.c |  63 +++++++++++++++-
 drivers/nvme/target/nvmet.h    |   1 +
 drivers/nvme/target/tcp.c      | 133 +++++++++++++++++++++++++++++++--
 4 files changed, 203 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/target/Kconfig b/drivers/nvme/target/Kconfig
index 79fc64035ee3..15af2b4341f6 100644
--- a/drivers/nvme/target/Kconfig
+++ b/drivers/nvme/target/Kconfig
@@ -84,6 +84,20 @@ config NVME_TARGET_TCP
 
 	  If unsure, say N.
 
+config NVME_TARGET_TCP_TLS
+	bool "NVMe over Fabrics TCP target TLS encryption support"
+	depends on NVME_TARGET_TCP
+	select NVME_COMMON
+	select NVME_KEYRING
+	select NET_HANDSHAKE
+	help
+	  Enables TLS encryption for the NVMe TCP target using the netlink handshake API.
+
+	  The TLS handshake daemon is availble at
+	  https://github.com/oracle/ktls-utils.
+
+	  If unsure, say N.
+
 config NVME_TARGET_AUTH
 	bool "NVMe over Fabrics In-band Authentication support"
 	depends on NVME_TARGET
diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
index 7f826ac8b75c..49b407702ad5 100644
--- a/drivers/nvme/target/configfs.c
+++ b/drivers/nvme/target/configfs.c
@@ -15,6 +15,7 @@
 #ifdef CONFIG_NVME_TARGET_AUTH
 #include <linux/nvme-auth.h>
 #endif
+#include <linux/nvme-keyring.h>
 #include <crypto/hash.h>
 #include <crypto/kpp.h>
 
@@ -159,10 +160,15 @@ static const struct nvmet_type_name_map nvmet_addr_treq[] = {
 	{ NVMF_TREQ_NOT_REQUIRED,	"not required" },
 };
 
+static inline u8 nvmet_port_treq(struct nvmet_port *port)
+{
+	return (port->disc_addr.treq & NVME_TREQ_SECURE_CHANNEL_MASK);
+}
+
 static ssize_t nvmet_addr_treq_show(struct config_item *item, char *page)
 {
-	u8 treq = to_nvmet_port(item)->disc_addr.treq &
-		NVME_TREQ_SECURE_CHANNEL_MASK;
+	struct nvmet_port *port = to_nvmet_port(item);
+	u8 treq = nvmet_port_treq(port);
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(nvmet_addr_treq); i++) {
@@ -174,11 +180,16 @@ static ssize_t nvmet_addr_treq_show(struct config_item *item, char *page)
 	return snprintf(page, PAGE_SIZE, "\n");
 }
 
+static inline u8 nvmet_port_treq_mask(struct nvmet_port *port)
+{
+	return (port->disc_addr.treq & ~NVME_TREQ_SECURE_CHANNEL_MASK);
+}
+
 static ssize_t nvmet_addr_treq_store(struct config_item *item,
 		const char *page, size_t count)
 {
 	struct nvmet_port *port = to_nvmet_port(item);
-	u8 treq = port->disc_addr.treq & ~NVME_TREQ_SECURE_CHANNEL_MASK;
+	u8 treq = nvmet_port_treq_mask(port);
 	int i;
 
 	if (nvmet_is_port_enabled(port, __func__))
@@ -193,6 +204,23 @@ static ssize_t nvmet_addr_treq_store(struct config_item *item,
 	return -EINVAL;
 
 found:
+	if (port->disc_addr.trtype == NVMF_TRTYPE_TCP) {
+		if (!IS_ENABLED(CONFIG_NVME_TARGET_TCP_TLS)) {
+			pr_err("TLS is not supported\n");
+			return -EINVAL;
+		}
+		if (!port->keyring) {
+			pr_err("TLS keyring not configured\n");
+			return -EINVAL;
+		}
+		if (port->disc_addr.tsas.tcp.sectype != NVMF_TCP_SECTYPE_TLS13) {
+			pr_warn("cannot change TREQ when TLS is not enabled\n");
+			return -EINVAL;
+		} else if (nvmet_addr_treq[i].type == NVMF_TREQ_NOT_SPECIFIED) {
+			pr_warn("cannot set TREQ to 'not specified' when TLS is enabled\n");
+			return -EINVAL;
+		}
+	}
 	treq |= nvmet_addr_treq[i].type;
 	port->disc_addr.treq = treq;
 	return count;
@@ -371,6 +399,7 @@ static ssize_t nvmet_addr_tsas_store(struct config_item *item,
 		const char *page, size_t count)
 {
 	struct nvmet_port *port = to_nvmet_port(item);
+	u8 treq = nvmet_port_treq_mask(port);
 	int i;
 
 	if (nvmet_is_port_enabled(port, __func__))
@@ -379,6 +408,15 @@ static ssize_t nvmet_addr_tsas_store(struct config_item *item,
 	if (port->disc_addr.trtype != NVMF_TRTYPE_TCP)
 		return -EINVAL;
 
+	if (!IS_ENABLED(CONFIG_NVME_TARGET_TCP_TLS)) {
+		pr_err("TLS is not supported\n");
+		return -EINVAL;
+	}
+	if (!port->keyring) {
+		pr_err("TLS keyring not configured\n");
+		return -EINVAL;
+	}
+
 	for (i = 0; i < ARRAY_SIZE(nvmet_addr_tsas_tcp); i++) {
 		if (sysfs_streq(page, nvmet_addr_tsas_tcp[i].name))
 			goto found;
@@ -389,6 +427,16 @@ static ssize_t nvmet_addr_tsas_store(struct config_item *item,
 
 found:
 	nvmet_port_init_tsas_tcp(port, nvmet_addr_tsas_tcp[i].type);
+	if (nvmet_addr_tsas_tcp[i].type == NVMF_TCP_SECTYPE_TLS13) {
+		if (nvmet_port_treq(port) == NVMF_TREQ_NOT_SPECIFIED)
+			treq |= NVMF_TREQ_REQUIRED;
+		else
+			treq |= nvmet_port_treq(port);
+	} else {
+		/* Set to 'not specified' if TLS is not enabled */
+		treq |= NVMF_TREQ_NOT_SPECIFIED;
+	}
+	port->disc_addr.treq = treq;
 	return count;
 }
 
@@ -1795,6 +1843,7 @@ static void nvmet_port_release(struct config_item *item)
 	flush_workqueue(nvmet_wq);
 	list_del(&port->global_entry);
 
+	key_put(port->keyring);
 	kfree(port->ana_state);
 	kfree(port);
 }
@@ -1844,6 +1893,14 @@ static struct config_group *nvmet_ports_make(struct config_group *group,
 		return ERR_PTR(-ENOMEM);
 	}
 
+	if (nvme_keyring_id()) {
+		port->keyring = key_lookup(nvme_keyring_id());
+		if (IS_ERR(port->keyring)) {
+			pr_warn("NVMe keyring not available, disabling TLS\n");
+			port->keyring = NULL;
+		}
+	}
+
 	for (i = 1; i <= NVMET_MAX_ANAGRPS; i++) {
 		if (i == NVMET_DEFAULT_ANA_GRPID)
 			port->ana_state[1] = NVME_ANA_OPTIMIZED;
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index 8cfd60f3b564..7f9ae53c1df5 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -158,6 +158,7 @@ struct nvmet_port {
 	struct config_group		ana_groups_group;
 	struct nvmet_ana_group		ana_default_group;
 	enum nvme_ana_state		*ana_state;
+	struct key			*keyring;
 	void				*priv;
 	bool				enabled;
 	int				inline_data_size;
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index fdc351f591a4..7279c994abd6 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -8,9 +8,13 @@
 #include <linux/init.h>
 #include <linux/slab.h>
 #include <linux/err.h>
+#include <linux/key.h>
 #include <linux/nvme-tcp.h>
+#include <linux/nvme-keyring.h>
 #include <net/sock.h>
 #include <net/tcp.h>
+#include <net/tls.h>
+#include <net/handshake.h>
 #include <linux/inet.h>
 #include <linux/llist.h>
 #include <crypto/hash.h>
@@ -66,6 +70,16 @@ device_param_cb(idle_poll_period_usecs, &set_param_ops,
 MODULE_PARM_DESC(idle_poll_period_usecs,
 		"nvmet tcp io_work poll till idle time period in usecs: Default 0");
 
+#ifdef CONFIG_NVME_TARGET_TCP_TLS
+/*
+ * TLS handshake timeout
+ */
+static int tls_handshake_timeout = 30;
+module_param(tls_handshake_timeout, int, 0644);
+MODULE_PARM_DESC(tls_handshake_timeout,
+		 "nvme TLS handshake timeout in seconds (default 30)");
+#endif
+
 #define NVMET_TCP_RECV_BUDGET		8
 #define NVMET_TCP_SEND_BUDGET		8
 #define NVMET_TCP_IO_WORK_BUDGET	64
@@ -122,6 +136,7 @@ struct nvmet_tcp_cmd {
 
 enum nvmet_tcp_queue_state {
 	NVMET_TCP_Q_CONNECTING,
+	NVMET_TCP_Q_TLS_HANDSHAKE,
 	NVMET_TCP_Q_LIVE,
 	NVMET_TCP_Q_DISCONNECTING,
 };
@@ -154,6 +169,8 @@ struct nvmet_tcp_queue {
 	bool			data_digest;
 	struct ahash_request	*snd_hash;
 	struct ahash_request	*rcv_hash;
+	key_serial_t		tls_pskid;
+	struct delayed_work	tls_handshake_work;
 
 	unsigned long           poll_end;
 
@@ -1285,12 +1302,12 @@ static int nvmet_tcp_try_recv(struct nvmet_tcp_queue *queue,
 
 static void nvmet_tcp_schedule_release_queue(struct nvmet_tcp_queue *queue)
 {
-	spin_lock(&queue->state_lock);
+	spin_lock_irq(&queue->state_lock);
 	if (queue->state != NVMET_TCP_Q_DISCONNECTING) {
 		queue->state = NVMET_TCP_Q_DISCONNECTING;
 		queue_work(nvmet_wq, &queue->release_work);
 	}
-	spin_unlock(&queue->state_lock);
+	spin_unlock_irq(&queue->state_lock);
 }
 
 static inline void nvmet_tcp_arm_queue_deadline(struct nvmet_tcp_queue *queue)
@@ -1512,8 +1529,12 @@ static void nvmet_tcp_data_ready(struct sock *sk)
 
 	read_lock_bh(&sk->sk_callback_lock);
 	queue = sk->sk_user_data;
-	if (likely(queue))
-		queue_work_on(queue_cpu(queue), nvmet_tcp_wq, &queue->io_work);
+	if (queue->data_ready)
+		queue->data_ready(sk);
+	if (likely(queue) &&
+	    queue->state != NVMET_TCP_Q_TLS_HANDSHAKE)
+		queue_work_on(queue_cpu(queue), nvmet_tcp_wq,
+			      &queue->io_work);
 	read_unlock_bh(&sk->sk_callback_lock);
 }
 
@@ -1621,6 +1642,85 @@ static int nvmet_tcp_set_queue_sock(struct nvmet_tcp_queue *queue)
 	return ret;
 }
 
+#ifdef CONFIG_NVME_TARGET_TCP_TLS
+static void nvmet_tcp_tls_queue_reset(struct nvmet_tcp_queue *queue)
+{
+	spin_lock_irq(&queue->state_lock);
+	if (queue->state != NVMET_TCP_Q_TLS_HANDSHAKE) {
+		pr_warn("queue %d: TLS handshake already completed\n",
+			queue->idx);
+		spin_unlock_irq(&queue->state_lock);
+		return;
+	}
+	queue->state = NVMET_TCP_Q_CONNECTING;
+	spin_unlock_irq(&queue->state_lock);
+
+	pr_debug("queue %d: resetting queue callbacks after TLS handshake\n",
+		 queue->idx);
+	/*
+	 * Set callbacks after handshake; TLS implementation
+	 * might have changed the socket callbacks.
+	 */
+	nvmet_tcp_set_queue_sock(queue);
+}
+
+static void nvmet_tcp_tls_handshake_done(void *data, int status,
+					 key_serial_t peerid)
+{
+	struct nvmet_tcp_queue *queue = data;
+
+	pr_debug("queue %d: TLS handshake done, key %x, status %d\n",
+		 queue->idx, peerid, status);
+	if (!status) {
+		spin_lock_irq(&queue->state_lock);
+		queue->tls_pskid = peerid;
+		spin_unlock_irq(&queue->state_lock);
+	}
+	cancel_delayed_work_sync(&queue->tls_handshake_work);
+	if (status)
+		nvmet_tcp_schedule_release_queue(queue);
+	else
+		nvmet_tcp_tls_queue_reset(queue);
+}
+
+static void nvmet_tcp_tls_handshake_timeout_work(struct work_struct *w)
+{
+	struct nvmet_tcp_queue *queue = container_of(to_delayed_work(w),
+			struct nvmet_tcp_queue, tls_handshake_work);
+
+	pr_debug("queue %d: TLS handshake timeout\n", queue->idx);
+	nvmet_tcp_schedule_release_queue(queue);
+}
+
+static int nvmet_tcp_tls_handshake(struct nvmet_tcp_queue *queue)
+{
+	int ret = -EOPNOTSUPP;
+	struct tls_handshake_args args;
+
+	if (queue->state != NVMET_TCP_Q_TLS_HANDSHAKE) {
+		pr_warn("cannot start TLS in state %d\n", queue->state);
+		return -EINVAL;
+	}
+
+	pr_debug("queue %d: TLS ServerHello\n", queue->idx);
+	memset(&args, 0, sizeof(args));
+	args.ta_sock = queue->sock;
+	args.ta_done = nvmet_tcp_tls_handshake_done;
+	args.ta_data = queue;
+	args.ta_keyring = key_serial(queue->port->nport->keyring);
+	args.ta_timeout_ms = tls_handshake_timeout * 2 * 1024;
+
+	ret = tls_server_hello_psk(&args, GFP_KERNEL);
+	if (ret) {
+		pr_err("failed to start TLS, err=%d\n", ret);
+	} else {
+		queue_delayed_work(nvmet_wq, &queue->tls_handshake_work,
+				   tls_handshake_timeout * HZ);
+	}
+	return ret;
+}
+#endif
+
 static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
 		struct socket *newsock)
 {
@@ -1638,7 +1738,11 @@ static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
 	queue->port = port;
 	queue->nr_cmds = 0;
 	spin_lock_init(&queue->state_lock);
-	queue->state = NVMET_TCP_Q_CONNECTING;
+	if (queue->port->nport->disc_addr.tsas.tcp.sectype ==
+	    NVMF_TCP_SECTYPE_TLS13)
+		queue->state = NVMET_TCP_Q_TLS_HANDSHAKE;
+	else
+		queue->state = NVMET_TCP_Q_CONNECTING;
 	INIT_LIST_HEAD(&queue->free_list);
 	init_llist_head(&queue->resp_list);
 	INIT_LIST_HEAD(&queue->resp_send_list);
@@ -1669,6 +1773,25 @@ static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
 	list_add_tail(&queue->queue_list, &nvmet_tcp_queue_list);
 	mutex_unlock(&nvmet_tcp_queue_mutex);
 
+#ifdef CONFIG_NVME_TARGET_TCP_TLS
+	INIT_DELAYED_WORK(&queue->tls_handshake_work,
+			  nvmet_tcp_tls_handshake_timeout_work);
+	if (queue->state == NVMET_TCP_Q_TLS_HANDSHAKE) {
+		struct sock *sk = queue->sock->sk;
+
+		/* Restore the default callbacks before starting upcall */
+		read_lock_bh(&sk->sk_callback_lock);
+		sk->sk_user_data = NULL;
+		sk->sk_data_ready = port->data_ready;
+		read_unlock_bh(&sk->sk_callback_lock);
+		if (!nvmet_tcp_tls_handshake(queue))
+			return;
+
+		/* TLS handshake failed, terminate the connection */
+		goto out_destroy_sq;
+	}
+#endif
+
 	ret = nvmet_tcp_set_queue_sock(queue);
 	if (ret)
 		goto out_destroy_sq;
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 14/14] nvmet-tcp: control messages for recvmsg()
  2023-08-03 10:50 [PATCHv5 00/14] nvme: In-kernel TLS support for TCP Hannes Reinecke
                   ` (12 preceding siblings ...)
  2023-08-03 10:51 ` [PATCH 13/14] nvmet-tcp: enable TLS handshake upcall Hannes Reinecke
@ 2023-08-03 10:51 ` Hannes Reinecke
  13 siblings, 0 replies; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-03 10:51 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

kTLS requires control messages for recvmsg() to relay any out-of-band
TLS messages (eg TLS alerts) to the caller.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/target/tcp.c | 87 +++++++++++++++++++++++++++++++++------
 1 file changed, 74 insertions(+), 13 deletions(-)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 7279c994abd6..e2db573d68d9 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -14,6 +14,7 @@
 #include <net/sock.h>
 #include <net/tcp.h>
 #include <net/tls.h>
+#include <net/tls_prot.h>
 #include <net/handshake.h>
 #include <linux/inet.h>
 #include <linux/llist.h>
@@ -118,6 +119,7 @@ struct nvmet_tcp_cmd {
 	u32				pdu_len;
 	u32				pdu_recv;
 	int				sg_idx;
+	char				recv_cbuf[CMSG_LEN(sizeof(char))];
 	struct msghdr			recv_msg;
 	struct bio_vec			*iov;
 	u32				flags;
@@ -1116,12 +1118,49 @@ static inline bool nvmet_tcp_pdu_valid(u8 type)
 	return false;
 }
 
+static int nvmet_tcp_tls_record_ok(struct socket *sock, struct msghdr *msg, char *cbuf)
+{
+	struct cmsghdr *cmsg = (struct cmsghdr *)cbuf;
+	u8 ctype, level, description;
+	int ret = 0;
+
+	if (!IS_ENABLED(CONFIG_NVME_TARGET_TCP_TLS))
+		return 0;
+
+	ctype = tls_get_record_type(sock->sk, cmsg);
+	switch (ctype) {
+	case 0:
+		break;
+	case TLS_RECORD_TYPE_DATA:
+		break;
+	case TLS_RECORD_TYPE_ALERT:
+		tls_alert_recv(sock->sk, msg, &level, &description);
+		pr_err("TLS Alert level %u desc %u\n", level, description);
+		ret = (level == TLS_ALERT_LEVEL_FATAL) ?
+			-ENOTCONN : -EAGAIN;
+		break;
+	default:
+		/* discard this record type */
+		pr_err("TLS record %d unhandled\n", ctype);
+		ret = -EAGAIN;
+		break;
+	}
+	return ret;
+}
+
 static int nvmet_tcp_try_recv_pdu(struct nvmet_tcp_queue *queue)
 {
 	struct nvme_tcp_hdr *hdr = &queue->pdu.cmd.hdr;
-	int len;
+	int len, ret;
 	struct kvec iov;
-	struct msghdr msg = { .msg_flags = MSG_DONTWAIT };
+	char cbuf[CMSG_LEN(sizeof(char))] = {};
+	struct msghdr msg = {
+#ifdef CONFIG_NVME_TARGET_TCP_TLS
+		.msg_control = cbuf,
+		.msg_controllen = sizeof(cbuf),
+#endif
+		.msg_flags = MSG_DONTWAIT
+	};
 
 recv:
 	iov.iov_base = (void *)&queue->pdu + queue->offset;
@@ -1130,6 +1169,9 @@ static int nvmet_tcp_try_recv_pdu(struct nvmet_tcp_queue *queue)
 			iov.iov_len, msg.msg_flags);
 	if (unlikely(len < 0))
 		return len;
+	ret = nvmet_tcp_tls_record_ok(queue->sock, &msg, cbuf);
+	if (ret < 0)
+		return ret;
 
 	queue->offset += len;
 	queue->left -= len;
@@ -1182,16 +1224,21 @@ static void nvmet_tcp_prep_recv_ddgst(struct nvmet_tcp_cmd *cmd)
 static int nvmet_tcp_try_recv_data(struct nvmet_tcp_queue *queue)
 {
 	struct nvmet_tcp_cmd  *cmd = queue->cmd;
-	int ret;
+	int len, ret;
 
 	while (msg_data_left(&cmd->recv_msg)) {
-		ret = sock_recvmsg(cmd->queue->sock, &cmd->recv_msg,
+		len = sock_recvmsg(cmd->queue->sock, &cmd->recv_msg,
 			cmd->recv_msg.msg_flags);
-		if (ret <= 0)
+		if (len <= 0)
+			return len;
+		ret = nvmet_tcp_tls_record_ok(cmd->queue->sock,
+					      &cmd->recv_msg,
+					      cmd->recv_cbuf);
+		if (ret < 0)
 			return ret;
 
-		cmd->pdu_recv += ret;
-		cmd->rbytes_done += ret;
+		cmd->pdu_recv += len;
+		cmd->rbytes_done += len;
 	}
 
 	if (queue->data_digest) {
@@ -1209,20 +1256,30 @@ static int nvmet_tcp_try_recv_data(struct nvmet_tcp_queue *queue)
 static int nvmet_tcp_try_recv_ddgst(struct nvmet_tcp_queue *queue)
 {
 	struct nvmet_tcp_cmd *cmd = queue->cmd;
-	int ret;
-	struct msghdr msg = { .msg_flags = MSG_DONTWAIT };
+	int ret, len;
+	char cbuf[CMSG_LEN(sizeof(char))] = {};
+	struct msghdr msg = {
+#ifdef CONFIG_NVME_TARGET_TCP_TLS
+		.msg_control = cbuf,
+		.msg_controllen = sizeof(cbuf),
+#endif
+		.msg_flags = MSG_DONTWAIT
+	};
 	struct kvec iov = {
 		.iov_base = (void *)&cmd->recv_ddgst + queue->offset,
 		.iov_len = queue->left
 	};
 
-	ret = kernel_recvmsg(queue->sock, &msg, &iov, 1,
+	len = kernel_recvmsg(queue->sock, &msg, &iov, 1,
 			iov.iov_len, msg.msg_flags);
-	if (unlikely(ret < 0))
+	if (unlikely(len < 0))
+		return len;
+	ret = nvmet_tcp_tls_record_ok(queue->sock, &msg, cbuf);
+	if (ret < 0)
 		return ret;
 
-	queue->offset += ret;
-	queue->left -= ret;
+	queue->offset += len;
+	queue->left -= len;
 	if (queue->left)
 		return -EAGAIN;
 
@@ -1389,6 +1446,10 @@ static int nvmet_tcp_alloc_cmd(struct nvmet_tcp_queue *queue,
 	if (!c->r2t_pdu)
 		goto out_free_data;
 
+	if (IS_ENABLED(CONFIG_NVME_TARGET_TCP_TLS)) {
+		c->recv_msg.msg_control = c->recv_cbuf;
+		c->recv_msg.msg_controllen = sizeof(c->recv_cbuf);
+	}
 	c->recv_msg.msg_flags = MSG_DONTWAIT | MSG_NOSIGNAL;
 
 	list_add_tail(&c->entry, &queue->free_list);
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH 01/14] nvme-keyring: register '.nvme' keyring
  2023-08-03 10:50 ` [PATCH 01/14] nvme-keyring: register '.nvme' keyring Hannes Reinecke
@ 2023-08-07  7:09   ` Sagi Grimberg
  0 siblings, 0 replies; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-07  7:09 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 02/14] nvme-keyring: define a 'psk' keytype
  2023-08-03 10:50 ` [PATCH 02/14] nvme-keyring: define a 'psk' keytype Hannes Reinecke
@ 2023-08-07  7:11   ` Sagi Grimberg
  0 siblings, 0 replies; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-07  7:11 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 05/14] nvme-keyring: implement nvme_tls_psk_default()
  2023-08-03 10:50 ` [PATCH 05/14] nvme-keyring: implement nvme_tls_psk_default() Hannes Reinecke
@ 2023-08-07  7:13   ` Sagi Grimberg
  0 siblings, 0 replies; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-07  7:13 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme


> Implement a function to select the preferred PSK for TLS.

I think we should state both in the log message (in addition
to a comment) what is the selection criteria.

Other than that,
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 06/14] security/keys: export key_lookup()
  2023-08-03 10:50 ` [PATCH 06/14] security/keys: export key_lookup() Hannes Reinecke
@ 2023-08-07  7:13   ` Sagi Grimberg
  0 siblings, 0 replies; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-07  7:13 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme, David Howells

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 07/14] nvme/tcp: allocate socket file
  2023-08-03 10:50 ` [PATCH 07/14] nvme/tcp: allocate socket file Hannes Reinecke
@ 2023-08-07  7:15   ` Sagi Grimberg
  2023-08-07  7:23     ` Hannes Reinecke
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-07  7:15 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme



On 8/3/23 13:50, Hannes Reinecke wrote:
> When using the TLS upcall we need to allocate a socket file such
> that the userspace daemon is able to use the socket.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>   drivers/nvme/host/tcp.c | 14 ++++++++++++--
>   1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 9ce417cd32a7..c3b3b938c0f9 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1338,7 +1338,9 @@ static void nvme_tcp_free_queue(struct nvme_ctrl *nctrl, int qid)
>   	}
>   
>   	noreclaim_flag = memalloc_noreclaim_save();
> -	sock_release(queue->sock);
> +	/* ->sock will be released by fput() */
> +	fput(queue->sock->file);
> +	queue->sock = NULL;
>   	memalloc_noreclaim_restore(noreclaim_flag);
>   
>   	kfree(queue->pdu);
> @@ -1512,6 +1514,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid)
>   	struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
>   	struct nvme_tcp_queue *queue = &ctrl->queues[qid];
>   	int ret, rcv_pdu_size;
> +	struct file *sock_file;
>   
>   	mutex_init(&queue->queue_lock);
>   	queue->ctrl = ctrl;
> @@ -1534,6 +1537,12 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid)
>   		goto err_destroy_mutex;
>   	}
>   
> +	sock_file = sock_alloc_file(queue->sock, O_CLOEXEC, NULL);
> +	if (IS_ERR(sock_file)) {
> +		sock_release(queue->sock);
> +		ret = PTR_ERR(sock_file);
> +		goto err_sock;

Is it safe to fput() an error sock file?

> +	}
>   	nvme_tcp_reclassify_socket(queue->sock);
>   
>   	/* Single syn retry */
> @@ -1640,7 +1649,8 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid)
>   	if (queue->hdr_digest || queue->data_digest)
>   		nvme_tcp_free_crypto(queue);
>   err_sock:
> -	sock_release(queue->sock);
> +	/* ->sock will be released by fput() */
> +	fput(queue->sock->file);
>   	queue->sock = NULL;
>   err_destroy_mutex:
>   	mutex_destroy(&queue->send_mutex);


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 07/14] nvme/tcp: allocate socket file
  2023-08-07  7:15   ` Sagi Grimberg
@ 2023-08-07  7:23     ` Hannes Reinecke
  0 siblings, 0 replies; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-07  7:23 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

On 8/7/23 09:15, Sagi Grimberg wrote:
> 
> 
> On 8/3/23 13:50, Hannes Reinecke wrote:
>> When using the TLS upcall we need to allocate a socket file such
>> that the userspace daemon is able to use the socket.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   drivers/nvme/host/tcp.c | 14 ++++++++++++--
>>   1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>> index 9ce417cd32a7..c3b3b938c0f9 100644
>> --- a/drivers/nvme/host/tcp.c
>> +++ b/drivers/nvme/host/tcp.c
>> @@ -1338,7 +1338,9 @@ static void nvme_tcp_free_queue(struct nvme_ctrl 
>> *nctrl, int qid)
>>       }
>>       noreclaim_flag = memalloc_noreclaim_save();
>> -    sock_release(queue->sock);
>> +    /* ->sock will be released by fput() */
>> +    fput(queue->sock->file);
>> +    queue->sock = NULL;
>>       memalloc_noreclaim_restore(noreclaim_flag);
>>       kfree(queue->pdu);
>> @@ -1512,6 +1514,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl 
>> *nctrl, int qid)
>>       struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
>>       struct nvme_tcp_queue *queue = &ctrl->queues[qid];
>>       int ret, rcv_pdu_size;
>> +    struct file *sock_file;
>>       mutex_init(&queue->queue_lock);
>>       queue->ctrl = ctrl;
>> @@ -1534,6 +1537,12 @@ static int nvme_tcp_alloc_queue(struct 
>> nvme_ctrl *nctrl, int qid)
>>           goto err_destroy_mutex;
>>       }
>> +    sock_file = sock_alloc_file(queue->sock, O_CLOEXEC, NULL);
>> +    if (IS_ERR(sock_file)) {
>> +        sock_release(queue->sock);
>> +        ret = PTR_ERR(sock_file);
>> +        goto err_sock;
> 
> Is it safe to fput() an error sock file?
> 
Probably not. I'll fix it up.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 08/14] nvme-tcp: enable TLS handshake upcall
  2023-08-03 10:50 ` [PATCH 08/14] nvme-tcp: enable TLS handshake upcall Hannes Reinecke
@ 2023-08-07  8:20   ` Sagi Grimberg
  2023-08-07  8:32     ` Hannes Reinecke
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-07  8:20 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme



On 8/3/23 13:50, Hannes Reinecke wrote:
> Add a fabrics option 'tls' and start the TLS handshake upcall
> with the default PSK. When TLS is started the PSK key serial
> number is displayed in the sysfs attribute 'tls_key'
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>   drivers/nvme/host/Kconfig   |  14 ++++
>   drivers/nvme/host/core.c    |   2 +-
>   drivers/nvme/host/fabrics.c |  12 ++++
>   drivers/nvme/host/fabrics.h |   3 +
>   drivers/nvme/host/nvme.h    |   1 +
>   drivers/nvme/host/sysfs.c   |  20 ++++++
>   drivers/nvme/host/tcp.c     | 130 ++++++++++++++++++++++++++++++++++--
>   7 files changed, 175 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
> index 2f6a7f8c94e8..96a74041bb0a 100644
> --- a/drivers/nvme/host/Kconfig
> +++ b/drivers/nvme/host/Kconfig
> @@ -92,6 +92,20 @@ config NVME_TCP
>   
>   	  If unsure, say N.
>   
> +config NVME_TCP_TLS
> +	bool "NVMe over Fabrics TCP TLS encryption support"
> +	depends on NVME_TCP
> +	select NVME_COMMON
> +	select NVME_KEYRING
> +	select NET_HANDSHAKE
> +	help
> +	  Enables TLS encryption for NVMe TCP using the netlink handshake API.
> +
> +	  The TLS handshake daemon is availble at
> +	  https://github.com/oracle/ktls-utils.
> +
> +	  If unsure, say N.
> +
>   config NVME_AUTH
>   	bool "NVM Express over Fabrics In-Band Authentication"
>   	depends on NVME_CORE
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index dfc574d0f18d..b52e9c9bffd6 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4380,7 +4380,7 @@ static void nvme_free_ctrl(struct device *dev)
>   
>   	if (!subsys || ctrl->instance != subsys->instance)
>   		ida_free(&nvme_instance_ida, ctrl->instance);
> -
> +	key_put(ctrl->tls_key);
>   	nvme_free_cels(ctrl);
>   	nvme_mpath_uninit(ctrl);
>   	nvme_auth_stop(ctrl);
> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
> index 8175d49f2909..ddad482c3537 100644
> --- a/drivers/nvme/host/fabrics.c
> +++ b/drivers/nvme/host/fabrics.c
> @@ -647,6 +647,9 @@ static const match_table_t opt_tokens = {
>   	{ NVMF_OPT_DISCOVERY,		"discovery"		},
>   	{ NVMF_OPT_DHCHAP_SECRET,	"dhchap_secret=%s"	},
>   	{ NVMF_OPT_DHCHAP_CTRL_SECRET,	"dhchap_ctrl_secret=%s"	},
> +#ifdef CONFIG_NVME_TCP_TLS
> +	{ NVMF_OPT_TLS,			"tls"			},
> +#endif
>   	{ NVMF_OPT_ERR,			NULL			}
>   };
>   
> @@ -671,6 +674,7 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
>   	opts->hdr_digest = false;
>   	opts->data_digest = false;
>   	opts->tos = -1; /* < 0 == use transport default */
> +	opts->tls = false;
>   
>   	options = o = kstrdup(buf, GFP_KERNEL);
>   	if (!options)
> @@ -955,6 +959,14 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
>   			kfree(opts->dhchap_ctrl_secret);
>   			opts->dhchap_ctrl_secret = p;
>   			break;
> +		case NVMF_OPT_TLS:
> +			if (!IS_ENABLED(CONFIG_NVME_TCP_TLS)) {
> +				pr_err("TLS is not supported\n");
> +				ret = -EINVAL;
> +				goto out;
> +			}
> +			opts->tls = true;
> +			break;
>   		default:
>   			pr_warn("unknown parameter or missing value '%s' in ctrl creation request\n",
>   				p);
> diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
> index 82e7a27ffbde..dac17c3fee26 100644
> --- a/drivers/nvme/host/fabrics.h
> +++ b/drivers/nvme/host/fabrics.h
> @@ -70,6 +70,7 @@ enum {
>   	NVMF_OPT_DISCOVERY	= 1 << 22,
>   	NVMF_OPT_DHCHAP_SECRET	= 1 << 23,
>   	NVMF_OPT_DHCHAP_CTRL_SECRET = 1 << 24,
> +	NVMF_OPT_TLS		= 1 << 25,
>   };
>   
>   /**
> @@ -102,6 +103,7 @@ enum {
>    * @dhchap_secret: DH-HMAC-CHAP secret
>    * @dhchap_ctrl_secret: DH-HMAC-CHAP controller secret for bi-directional
>    *              authentication
> + * @tls:        Start TLS encrypted connections (TCP)
>    * @disable_sqflow: disable controller sq flow control
>    * @hdr_digest: generate/verify header digest (TCP)
>    * @data_digest: generate/verify data digest (TCP)
> @@ -128,6 +130,7 @@ struct nvmf_ctrl_options {
>   	struct nvmf_host	*host;
>   	char			*dhchap_secret;
>   	char			*dhchap_ctrl_secret;
> +	bool			tls;
>   	bool			disable_sqflow;
>   	bool			hdr_digest;
>   	bool			data_digest;
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index f35647c470af..6fe7966f720b 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -357,6 +357,7 @@ struct nvme_ctrl {
>   	struct nvme_dhchap_key *ctrl_key;
>   	u16 transaction;
>   #endif
> +	struct key *tls_key;
>   
>   	/* Power saving configuration */
>   	u64 ps_max_latency_us;
> diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
> index 212e1b05d298..03a64674d0a4 100644
> --- a/drivers/nvme/host/sysfs.c
> +++ b/drivers/nvme/host/sysfs.c
> @@ -527,6 +527,19 @@ static DEVICE_ATTR(dhchap_ctrl_secret, S_IRUGO | S_IWUSR,
>   	nvme_ctrl_dhchap_ctrl_secret_show, nvme_ctrl_dhchap_ctrl_secret_store);
>   #endif
>   
> +#ifdef CONFIG_NVME_TCP_TLS
> +static ssize_t tls_key_show(struct device *dev,
> +			    struct device_attribute *attr, char *buf)
> +{
> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
> +
> +	if (!ctrl->tls_key)
> +		return 0;
> +	return sysfs_emit(buf, "%08x", key_serial(ctrl->tls_key));
> +}
> +static DEVICE_ATTR_RO(tls_key);
> +#endif
> +
>   static struct attribute *nvme_dev_attrs[] = {
>   	&dev_attr_reset_controller.attr,
>   	&dev_attr_rescan_controller.attr,
> @@ -553,6 +566,9 @@ static struct attribute *nvme_dev_attrs[] = {
>   #ifdef CONFIG_NVME_AUTH
>   	&dev_attr_dhchap_secret.attr,
>   	&dev_attr_dhchap_ctrl_secret.attr,
> +#endif
> +#ifdef CONFIG_NVME_TCP_TLS
> +	&dev_attr_tls_key.attr,
>   #endif
>   	NULL
>   };
> @@ -583,6 +599,10 @@ static umode_t nvme_dev_attrs_are_visible(struct kobject *kobj,
>   	if (a == &dev_attr_dhchap_ctrl_secret.attr && !ctrl->opts)
>   		return 0;
>   #endif
> +#ifdef CONFIG_NVME_TCP_TLS
> +	if (a == &dev_attr_tls_key.attr && !ctrl->opts)
> +		return 0;
> +#endif

Probably you want to check that the transport is tcp.

>   
>   	return a->mode;
>   }
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index c3b3b938c0f9..98400ad5f969 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -8,9 +8,13 @@
>   #include <linux/init.h>
>   #include <linux/slab.h>
>   #include <linux/err.h>
> +#include <linux/key.h>
>   #include <linux/nvme-tcp.h>
> +#include <linux/nvme-keyring.h>
>   #include <net/sock.h>
>   #include <net/tcp.h>
> +#include <net/tls.h>
> +#include <net/handshake.h>
>   #include <linux/blk-mq.h>
>   #include <crypto/hash.h>
>   #include <net/busy_poll.h>
> @@ -31,6 +35,16 @@ static int so_priority;
>   module_param(so_priority, int, 0644);
>   MODULE_PARM_DESC(so_priority, "nvme tcp socket optimize priority");
>   
> +#ifdef CONFIG_NVME_TCP_TLS
> +/*
> + * TLS handshake timeout
> + */
> +static int tls_handshake_timeout = 10;
> +module_param(tls_handshake_timeout, int, 0644);
> +MODULE_PARM_DESC(tls_handshake_timeout,
> +		 "nvme TLS handshake timeout in seconds (default 10)");
> +#endif
> +
>   #ifdef CONFIG_DEBUG_LOCK_ALLOC
>   /* lockdep can detect a circular dependency of the form
>    *   sk_lock -> mmap_lock (page fault) -> fs locks -> sk_lock
> @@ -146,7 +160,10 @@ struct nvme_tcp_queue {
>   	struct ahash_request	*snd_hash;
>   	__le32			exp_ddgst;
>   	__le32			recv_ddgst;
> -
> +#ifdef CONFIG_NVME_TCP_TLS
> +	struct completion       tls_complete;
> +	int                     tls_err;
> +#endif
>   	struct page_frag_cache	pf_cache;
>   
>   	void (*state_change)(struct sock *);
> @@ -1509,7 +1526,80 @@ static void nvme_tcp_set_queue_io_cpu(struct nvme_tcp_queue *queue)
>   	queue->io_cpu = cpumask_next_wrap(n - 1, cpu_online_mask, -1, false);
>   }
>   
> -static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid)
> +#ifdef CONFIG_NVME_TCP_TLS
> +static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid)
> +{
> +	struct nvme_tcp_queue *queue = data;
> +	struct nvme_tcp_ctrl *ctrl = queue->ctrl;
> +	int qid = nvme_tcp_queue_id(queue);
> +	struct key *tls_key;
> +
> +	dev_dbg(ctrl->ctrl.device, "queue %d: TLS handshake done, key %x, status %d\n",
> +		qid, pskid, status);
> +
> +	if (status) {
> +		queue->tls_err = -status;
> +		goto out_complete;
> +	}
> +
> +	tls_key = key_lookup(pskid);
> +	if (IS_ERR(tls_key)) {
> +		dev_warn(ctrl->ctrl.device, "queue %d: Invalid key %x\n",
> +			 qid, pskid);
> +		queue->tls_err = -ENOKEY;
> +	} else {
> +		ctrl->ctrl.tls_key = tls_key;
> +		queue->tls_err = 0;
> +	}
> +
> +out_complete:
> +	complete(&queue->tls_complete);
> +}
> +
> +static int nvme_tcp_start_tls(struct nvme_ctrl *nctrl,
> +			      struct nvme_tcp_queue *queue,
> +			      key_serial_t pskid)
> +{
> +	int qid = nvme_tcp_queue_id(queue);
> +	int ret;
> +	struct tls_handshake_args args;
> +	unsigned long tmo = tls_handshake_timeout * HZ;
> +	key_serial_t keyring = nvme_keyring_id();
> +
> +	dev_dbg(nctrl->device, "queue %d: start TLS with key %x\n",
> +		qid, pskid);
> +	memset(&args, 0, sizeof(args));
> +	args.ta_sock = queue->sock;
> +	args.ta_done = nvme_tcp_tls_done;
> +	args.ta_data = queue;
> +	args.ta_my_peerids[0] = pskid;
> +	args.ta_num_peerids = 1;
> +	args.ta_keyring = keyring;
> +	args.ta_timeout_ms = tls_handshake_timeout * 1000;
> +	queue->tls_err = -EOPNOTSUPP;
> +	init_completion(&queue->tls_complete);
> +	ret = tls_client_hello_psk(&args, GFP_KERNEL);
> +	if (ret) {
> +		dev_err(nctrl->device, "queue %d: failed to start TLS: %d\n",
> +			qid, ret);
> +		return ret;
> +	}
> +	if (wait_for_completion_timeout(&queue->tls_complete, tmo) == 0) {

Maybe this should be interruptible as well?

> +		dev_err(nctrl->device,
> +			"queue %d: TLS handshake timeout\n", qid);
> +		ret = -ETIMEDOUT;
> +	} else {
> +		dev_dbg(nctrl->device,
> +			"queue %d: TLS handshake complete, error %d\n",
> +			qid, queue->tls_err);
> +		ret = queue->tls_err;
> +	}
> +	return ret;
> +}
> +#endif
> +
> +static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid,
> +				key_serial_t pskid)
>   {
>   	struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
>   	struct nvme_tcp_queue *queue = &ctrl->queues[qid];
> @@ -1633,6 +1723,14 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid)
>   		goto err_rcv_pdu;
>   	}
>   
> +#ifdef CONFIG_NVME_TCP_TLS

pskid is passed unconditionally, is the ifdef needed?
maybe make nvme_tcp_start_tls stub if not defined?

> +	/* If PSKs are configured try to start TLS */
> +	if (pskid) {
> +		ret = nvme_tcp_start_tls(nctrl, queue, pskid);
> +		if (ret)
> +			goto err_init_connect;
> +	}
> +#endif
>   	ret = nvme_tcp_init_connection(queue);
>   	if (ret)
>   		goto err_init_connect;
> @@ -1782,10 +1880,22 @@ static int nvme_tcp_start_io_queues(struct nvme_ctrl *ctrl,
>   static int nvme_tcp_alloc_admin_queue(struct nvme_ctrl *ctrl)
>   {
>   	int ret;
> +	key_serial_t psk_id = 0;
> +
> +	if (ctrl->opts->tls) {
> +		psk_id = nvme_tls_psk_default(NULL,
> +					      ctrl->opts->host->nqn,
> +					      ctrl->opts->subsysnqn);

is the returned tls_key_id guaranteed to be non-zero?

> +		if (!psk_id) {
> +			dev_err(ctrl->device, "no valid PSK found\n");
> +			ret = -ENOKEY;
> +			goto out_free_queue;
> +		}
> +	}
>   
> -	ret = nvme_tcp_alloc_queue(ctrl, 0);
> +	ret = nvme_tcp_alloc_queue(ctrl, 0, psk_id);
>   	if (ret)
> -		return ret;
> +		goto out_free_queue;
>   
>   	ret = nvme_tcp_alloc_async_req(to_tcp_ctrl(ctrl));
>   	if (ret)
> @@ -1801,9 +1911,17 @@ static int nvme_tcp_alloc_admin_queue(struct nvme_ctrl *ctrl)
>   static int __nvme_tcp_alloc_io_queues(struct nvme_ctrl *ctrl)
>   {
>   	int i, ret;
> +	key_serial_t psk_id = 0;
>   
> +	if (ctrl->opts->tls) {
> +		if (!ctrl->tls_key) {
> +			dev_err(ctrl->device, "no PSK negotiated\n");
> +			return -ENOKEY;
> +		}
> +		psk_id = key_serial(ctrl->tls_key);
> +	}
>   	for (i = 1; i < ctrl->queue_count; i++) {
> -		ret = nvme_tcp_alloc_queue(ctrl, i);
> +		ret = nvme_tcp_alloc_queue(ctrl, i, psk_id);
>   		if (ret)
>   			goto out_free_queues;
>   	}
> @@ -2630,7 +2748,7 @@ static struct nvmf_transport_ops nvme_tcp_transport = {
>   			  NVMF_OPT_HOST_TRADDR | NVMF_OPT_CTRL_LOSS_TMO |
>   			  NVMF_OPT_HDR_DIGEST | NVMF_OPT_DATA_DIGEST |
>   			  NVMF_OPT_NR_WRITE_QUEUES | NVMF_OPT_NR_POLL_QUEUES |
> -			  NVMF_OPT_TOS | NVMF_OPT_HOST_IFACE,
> +			  NVMF_OPT_TOS | NVMF_OPT_HOST_IFACE | NVMF_OPT_TLS,
>   	.create_ctrl	= nvme_tcp_create_ctrl,
>   };
>   


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 09/14] nvme-tcp: control message handling for recvmsg()
  2023-08-03 10:50 ` [PATCH 09/14] nvme-tcp: control message handling for recvmsg() Hannes Reinecke
@ 2023-08-07  8:22   ` Sagi Grimberg
  2023-08-08  6:39     ` Hannes Reinecke
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-07  8:22 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme



On 8/3/23 13:50, Hannes Reinecke wrote:
> kTLS is sending TLS ALERT messages as control messages for recvmsg().
> As we can't do anything sensible with it just abort the connection
> and let the userspace agent to a re-negotiation.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/tcp.c | 18 +++++++++++++++++-
>   1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 98400ad5f969..4d0e3de39c26 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -14,6 +14,7 @@
>   #include <net/sock.h>
>   #include <net/tcp.h>
>   #include <net/tls.h>
> +#include <net/tls_prot.h>
>   #include <net/handshake.h>
>   #include <linux/blk-mq.h>
>   #include <crypto/hash.h>
> @@ -1369,6 +1370,10 @@ static int nvme_tcp_init_connection(struct nvme_tcp_queue *queue)
>   {
>   	struct nvme_tcp_icreq_pdu *icreq;
>   	struct nvme_tcp_icresp_pdu *icresp;
> +#ifdef CONFIG_NVME_TCP_TLS
> +	char cbuf[CMSG_LEN(sizeof(char))] = {};
> +	u8 ctype;
> +#endif
>   	struct msghdr msg = {};
>   	struct kvec iov;
>   	bool ctrl_hdgst, ctrl_ddgst;
> @@ -1406,11 +1411,22 @@ static int nvme_tcp_init_connection(struct nvme_tcp_queue *queue)
>   	memset(&msg, 0, sizeof(msg));
>   	iov.iov_base = icresp;
>   	iov.iov_len = sizeof(*icresp);
> +#ifdef CONFIG_NVME_TCP_TLS
> +	msg.msg_control = cbuf;
> +	msg.msg_controllen = sizeof(cbuf);
> +#endif
>   	ret = kernel_recvmsg(queue->sock, &msg, &iov, 1,
>   			iov.iov_len, msg.msg_flags);
>   	if (ret < 0)
>   		goto free_icresp;

Is the cmsg passed unconditionally? meaning this does not
impact non-tls?

If so, why is the ifdef needed?

> -
> +#ifdef CONFIG_NVME_TCP_TLS
> +	ctype = tls_get_record_type(queue->sock->sk, (struct cmsghdr *)cbuf);
> +	if (ctype != TLS_RECORD_TYPE_DATA) {
> +		pr_err("queue %d: unhandled TLS record %d\n",
> +		       nvme_tcp_queue_id(queue), ctype);
> +		return -ENOTCONN;
> +	}
> +#endif



>   	ret = -EINVAL;
>   	if (icresp->hdr.type != nvme_tcp_icresp) {
>   		pr_err("queue %d: bad type returned %d\n",


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 10/14] nvme-fabrics: parse options 'keyring' and 'tls_key'
  2023-08-03 10:50 ` [PATCH 10/14] nvme-fabrics: parse options 'keyring' and 'tls_key' Hannes Reinecke
@ 2023-08-07  8:23   ` Sagi Grimberg
  2023-08-07  8:34     ` Hannes Reinecke
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-07  8:23 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme



On 8/3/23 13:50, Hannes Reinecke wrote:
> Parse the fabrics options 'keyring' and 'tls_key' and store the
> referenced keys in the options structure.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>   drivers/nvme/host/fabrics.c | 69 ++++++++++++++++++++++++++++++++++++-
>   drivers/nvme/host/fabrics.h |  6 ++++
>   drivers/nvme/host/tcp.c     | 11 ++++--
>   3 files changed, 82 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
> index ddad482c3537..6bdcd505a477 100644
> --- a/drivers/nvme/host/fabrics.c
> +++ b/drivers/nvme/host/fabrics.c
> @@ -643,6 +643,10 @@ static const match_table_t opt_tokens = {
>   	{ NVMF_OPT_NR_WRITE_QUEUES,	"nr_write_queues=%d"	},
>   	{ NVMF_OPT_NR_POLL_QUEUES,	"nr_poll_queues=%d"	},
>   	{ NVMF_OPT_TOS,			"tos=%d"		},
> +#ifdef CONFIG_NVME_TCP_TLS
> +	{ NVMF_OPT_KEYRING,		"keyring=%d"		},
> +	{ NVMF_OPT_TLS_KEY,		"tls_key=%d"		},
> +#endif
>   	{ NVMF_OPT_FAIL_FAST_TMO,	"fast_io_fail_tmo=%d"	},
>   	{ NVMF_OPT_DISCOVERY,		"discovery"		},
>   	{ NVMF_OPT_DHCHAP_SECRET,	"dhchap_secret=%s"	},
> @@ -660,9 +664,10 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
>   	char *options, *o, *p;
>   	int token, ret = 0;
>   	size_t nqnlen  = 0;
> -	int ctrl_loss_tmo = NVMF_DEF_CTRL_LOSS_TMO;
> +	int ctrl_loss_tmo = NVMF_DEF_CTRL_LOSS_TMO, key_id;
>   	uuid_t hostid;
>   	char hostnqn[NVMF_NQN_SIZE];
> +	struct key *key = NULL;
>   
>   	/* Set defaults */
>   	opts->queue_size = NVMF_DEF_QUEUE_SIZE;
> @@ -928,6 +933,66 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
>   			}
>   			opts->tos = token;
>   			break;
> +		case NVMF_OPT_KEYRING:
> +			if (!IS_ENABLED(CONFIG_NVME_TCP_TLS)) {
> +				pr_err("TLS is not supported\n");
> +				ret = -EINVAL;
> +				goto out;
> +			}
> +			if (match_int(args, &key_id)) {
> +				ret = -EINVAL;
> +				goto out;
> +			}
> +			if (key_id < 0) {
> +				pr_err("Invalid keyring id %d\n", key_id);
> +				ret = -EINVAL;
> +				goto out;
> +			}
> +			if (!key_id) {
> +				pr_debug("Using default keyring\n");
> +				key_put(opts->keyring);
> +				opts->keyring = NULL;
> +				break;
> +			}
> +			key = key_lookup(key_id);
> +			if (!key) {
> +				pr_err("Keyring id %08x not found\n", key_id);
> +				ret = -ENOKEY;
> +				goto out;
> +			}
> +			key_put(opts->keyring);
> +			opts->keyring = key;
> +			break;
> +		case NVMF_OPT_TLS_KEY:
> +			if (!IS_ENABLED(CONFIG_NVME_TCP_TLS)) {
> +				pr_err("TLS is not supported\n");
> +				ret = -EINVAL;
> +				goto out;
> +			}
> +			if (match_int(args, &key_id)) {
> +				ret = -EINVAL;
> +				goto out;
> +			}
> +			if (key_id < 0) {
> +				pr_err("Invalid key id %d\n", key_id);
> +				ret = -EINVAL;
> +				goto out;
> +			}
> +			if (!key_id) {
> +				pr_debug("Using 'best' PSK\n");
> +				key_put(opts->tls_key);
> +				opts->tls_key = NULL;
> +				break;
> +			}
> +			key = key_lookup(key_id);
> +			if (!key) {
> +				pr_err("Key id %08x not found\n", key_id);
> +				ret = -ENOKEY;
> +				goto out;
> +			}
> +			key_put(opts->tls_key);
> +			opts->tls_key = key;
> +			break;

Maybe move to helper(s)? looks pretty similar.

>   		case NVMF_OPT_DISCOVERY:
>   			opts->discovery_nqn = true;
>   			break;
> @@ -1168,6 +1233,8 @@ static int nvmf_check_allowed_opts(struct nvmf_ctrl_options *opts,
>   void nvmf_free_options(struct nvmf_ctrl_options *opts)
>   {
>   	nvmf_host_put(opts->host);
> +	key_put(opts->keyring);
> +	key_put(opts->tls_key);
>   	kfree(opts->transport);
>   	kfree(opts->traddr);
>   	kfree(opts->trsvcid);
> diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
> index dac17c3fee26..fbaee5a7be19 100644
> --- a/drivers/nvme/host/fabrics.h
> +++ b/drivers/nvme/host/fabrics.h
> @@ -71,6 +71,8 @@ enum {
>   	NVMF_OPT_DHCHAP_SECRET	= 1 << 23,
>   	NVMF_OPT_DHCHAP_CTRL_SECRET = 1 << 24,
>   	NVMF_OPT_TLS		= 1 << 25,
> +	NVMF_OPT_KEYRING	= 1 << 26,
> +	NVMF_OPT_TLS_KEY	= 1 << 27,
>   };
>   
>   /**
> @@ -103,6 +105,8 @@ enum {
>    * @dhchap_secret: DH-HMAC-CHAP secret
>    * @dhchap_ctrl_secret: DH-HMAC-CHAP controller secret for bi-directional
>    *              authentication
> + * @keyring:    Keyring to use for key lookups
> + * @tls_key:    TLS key for encrypted connections (TCP)
>    * @tls:        Start TLS encrypted connections (TCP)
>    * @disable_sqflow: disable controller sq flow control
>    * @hdr_digest: generate/verify header digest (TCP)
> @@ -130,6 +134,8 @@ struct nvmf_ctrl_options {
>   	struct nvmf_host	*host;
>   	char			*dhchap_secret;
>   	char			*dhchap_ctrl_secret;
> +	struct key		*keyring;
> +	struct key		*tls_key;
>   	bool			tls;
>   	bool			disable_sqflow;
>   	bool			hdr_digest;
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 4d0e3de39c26..cc6a672b8d3f 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1584,6 +1584,8 @@ static int nvme_tcp_start_tls(struct nvme_ctrl *nctrl,
>   
>   	dev_dbg(nctrl->device, "queue %d: start TLS with key %x\n",
>   		qid, pskid);
> +	if (nctrl->opts->keyring)
> +		keyring = key_serial(nctrl->opts->keyring);
>   	memset(&args, 0, sizeof(args));
>   	args.ta_sock = queue->sock;
>   	args.ta_done = nvme_tcp_tls_done;
> @@ -1899,9 +1901,12 @@ static int nvme_tcp_alloc_admin_queue(struct nvme_ctrl *ctrl)
>   	key_serial_t psk_id = 0;
>   
>   	if (ctrl->opts->tls) {
> -		psk_id = nvme_tls_psk_default(NULL,
> -					      ctrl->opts->host->nqn,
> -					      ctrl->opts->subsysnqn);
> +		if (ctrl->opts->tls_key)
> +			psk_id = key_serial(ctrl->opts->tls_key);
> +		else
> +			psk_id = nvme_tls_psk_default(ctrl->opts->keyring,
> +						      ctrl->opts->host->nqn,
> +						      ctrl->opts->subsysnqn);
>   		if (!psk_id) {
>   			dev_err(ctrl->device, "no valid PSK found\n");
>   			ret = -ENOKEY;


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 11/14] nvmet: make TCP sectype settable via configfs
  2023-08-03 10:50 ` [PATCH 11/14] nvmet: make TCP sectype settable via configfs Hannes Reinecke
@ 2023-08-07  8:25   ` Sagi Grimberg
  0 siblings, 0 replies; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-07  8:25 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 12/14] nvmet-tcp: allocate socket file
  2023-08-03 10:51 ` [PATCH 12/14] nvmet-tcp: allocate socket file Hannes Reinecke
@ 2023-08-07  8:27   ` Sagi Grimberg
  2023-08-07  8:49     ` Hannes Reinecke
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-07  8:27 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme



On 8/3/23 13:51, Hannes Reinecke wrote:
> For the TLS upcall we need to allocate a socket file such
> that the userspace daemon is able to use the socket.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>   drivers/nvme/target/tcp.c | 29 +++++++++++++++++------------
>   1 file changed, 17 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index 868aa4de2e4c..fdc351f591a4 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -1493,12 +1493,12 @@ static void nvmet_tcp_release_queue_work(struct work_struct *w)
>   	nvmet_sq_destroy(&queue->nvme_sq);
>   	cancel_work_sync(&queue->io_work);
>   	nvmet_tcp_free_cmd_data_in_buffers(queue);
> -	sock_release(queue->sock);
> +	/* ->sock will be released by fput() */
> +	fput(queue->sock->file);
>   	nvmet_tcp_free_cmds(queue);
>   	if (queue->hdr_digest || queue->data_digest)
>   		nvmet_tcp_free_crypto(queue);
>   	ida_free(&nvmet_tcp_queue_ida, queue->idx);
> -
>   	page = virt_to_head_page(queue->pf_cache.va);
>   	__page_frag_cache_drain(page, queue->pf_cache.pagecnt_bias);
>   	kfree(queue);
> @@ -1621,15 +1621,16 @@ static int nvmet_tcp_set_queue_sock(struct nvmet_tcp_queue *queue)
>   	return ret;
>   }
>   
> -static int nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
> +static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>   		struct socket *newsock)

Why does the function change from retcode to void in this patch?


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 08/14] nvme-tcp: enable TLS handshake upcall
  2023-08-07  8:20   ` Sagi Grimberg
@ 2023-08-07  8:32     ` Hannes Reinecke
  0 siblings, 0 replies; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-07  8:32 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

On 8/7/23 10:20, Sagi Grimberg wrote:
> 
> 
> On 8/3/23 13:50, Hannes Reinecke wrote:
>> Add a fabrics option 'tls' and start the TLS handshake upcall
>> with the default PSK. When TLS is started the PSK key serial
>> number is displayed in the sysfs attribute 'tls_key'
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   drivers/nvme/host/Kconfig   |  14 ++++
>>   drivers/nvme/host/core.c    |   2 +-
>>   drivers/nvme/host/fabrics.c |  12 ++++
>>   drivers/nvme/host/fabrics.h |   3 +
>>   drivers/nvme/host/nvme.h    |   1 +
>>   drivers/nvme/host/sysfs.c   |  20 ++++++
>>   drivers/nvme/host/tcp.c     | 130 ++++++++++++++++++++++++++++++++++--
>>   7 files changed, 175 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
>> index 2f6a7f8c94e8..96a74041bb0a 100644
>> --- a/drivers/nvme/host/Kconfig
>> +++ b/drivers/nvme/host/Kconfig
>> @@ -92,6 +92,20 @@ config NVME_TCP
>>         If unsure, say N.
>> +config NVME_TCP_TLS
>> +    bool "NVMe over Fabrics TCP TLS encryption support"
>> +    depends on NVME_TCP
>> +    select NVME_COMMON
>> +    select NVME_KEYRING
>> +    select NET_HANDSHAKE
>> +    help
>> +      Enables TLS encryption for NVMe TCP using the netlink handshake 
>> API.
>> +
>> +      The TLS handshake daemon is availble at
>> +      https://github.com/oracle/ktls-utils.
>> +
>> +      If unsure, say N.
>> +
>>   config NVME_AUTH
>>       bool "NVM Express over Fabrics In-Band Authentication"
>>       depends on NVME_CORE
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index dfc574d0f18d..b52e9c9bffd6 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -4380,7 +4380,7 @@ static void nvme_free_ctrl(struct device *dev)
>>       if (!subsys || ctrl->instance != subsys->instance)
>>           ida_free(&nvme_instance_ida, ctrl->instance);
>> -
>> +    key_put(ctrl->tls_key);
>>       nvme_free_cels(ctrl);
>>       nvme_mpath_uninit(ctrl);
>>       nvme_auth_stop(ctrl);
>> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
>> index 8175d49f2909..ddad482c3537 100644
>> --- a/drivers/nvme/host/fabrics.c
>> +++ b/drivers/nvme/host/fabrics.c
>> @@ -647,6 +647,9 @@ static const match_table_t opt_tokens = {
>>       { NVMF_OPT_DISCOVERY,        "discovery"        },
>>       { NVMF_OPT_DHCHAP_SECRET,    "dhchap_secret=%s"    },
>>       { NVMF_OPT_DHCHAP_CTRL_SECRET,    "dhchap_ctrl_secret=%s"    },
>> +#ifdef CONFIG_NVME_TCP_TLS
>> +    { NVMF_OPT_TLS,            "tls"            },
>> +#endif
>>       { NVMF_OPT_ERR,            NULL            }
>>   };
>> @@ -671,6 +674,7 @@ static int nvmf_parse_options(struct 
>> nvmf_ctrl_options *opts,
>>       opts->hdr_digest = false;
>>       opts->data_digest = false;
>>       opts->tos = -1; /* < 0 == use transport default */
>> +    opts->tls = false;
>>       options = o = kstrdup(buf, GFP_KERNEL);
>>       if (!options)
>> @@ -955,6 +959,14 @@ static int nvmf_parse_options(struct 
>> nvmf_ctrl_options *opts,
>>               kfree(opts->dhchap_ctrl_secret);
>>               opts->dhchap_ctrl_secret = p;
>>               break;
>> +        case NVMF_OPT_TLS:
>> +            if (!IS_ENABLED(CONFIG_NVME_TCP_TLS)) {
>> +                pr_err("TLS is not supported\n");
>> +                ret = -EINVAL;
>> +                goto out;
>> +            }
>> +            opts->tls = true;
>> +            break;
>>           default:
>>               pr_warn("unknown parameter or missing value '%s' in ctrl 
>> creation request\n",
>>                   p);
>> diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
>> index 82e7a27ffbde..dac17c3fee26 100644
>> --- a/drivers/nvme/host/fabrics.h
>> +++ b/drivers/nvme/host/fabrics.h
>> @@ -70,6 +70,7 @@ enum {
>>       NVMF_OPT_DISCOVERY    = 1 << 22,
>>       NVMF_OPT_DHCHAP_SECRET    = 1 << 23,
>>       NVMF_OPT_DHCHAP_CTRL_SECRET = 1 << 24,
>> +    NVMF_OPT_TLS        = 1 << 25,
>>   };
>>   /**
>> @@ -102,6 +103,7 @@ enum {
>>    * @dhchap_secret: DH-HMAC-CHAP secret
>>    * @dhchap_ctrl_secret: DH-HMAC-CHAP controller secret for 
>> bi-directional
>>    *              authentication
>> + * @tls:        Start TLS encrypted connections (TCP)
>>    * @disable_sqflow: disable controller sq flow control
>>    * @hdr_digest: generate/verify header digest (TCP)
>>    * @data_digest: generate/verify data digest (TCP)
>> @@ -128,6 +130,7 @@ struct nvmf_ctrl_options {
>>       struct nvmf_host    *host;
>>       char            *dhchap_secret;
>>       char            *dhchap_ctrl_secret;
>> +    bool            tls;
>>       bool            disable_sqflow;
>>       bool            hdr_digest;
>>       bool            data_digest;
>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
>> index f35647c470af..6fe7966f720b 100644
>> --- a/drivers/nvme/host/nvme.h
>> +++ b/drivers/nvme/host/nvme.h
>> @@ -357,6 +357,7 @@ struct nvme_ctrl {
>>       struct nvme_dhchap_key *ctrl_key;
>>       u16 transaction;
>>   #endif
>> +    struct key *tls_key;
>>       /* Power saving configuration */
>>       u64 ps_max_latency_us;
>> diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
>> index 212e1b05d298..03a64674d0a4 100644
>> --- a/drivers/nvme/host/sysfs.c
>> +++ b/drivers/nvme/host/sysfs.c
>> @@ -527,6 +527,19 @@ static DEVICE_ATTR(dhchap_ctrl_secret, S_IRUGO | 
>> S_IWUSR,
>>       nvme_ctrl_dhchap_ctrl_secret_show, 
>> nvme_ctrl_dhchap_ctrl_secret_store);
>>   #endif
>> +#ifdef CONFIG_NVME_TCP_TLS
>> +static ssize_t tls_key_show(struct device *dev,
>> +                struct device_attribute *attr, char *buf)
>> +{
>> +    struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
>> +
>> +    if (!ctrl->tls_key)
>> +        return 0;
>> +    return sysfs_emit(buf, "%08x", key_serial(ctrl->tls_key));
>> +}
>> +static DEVICE_ATTR_RO(tls_key);
>> +#endif
>> +
>>   static struct attribute *nvme_dev_attrs[] = {
>>       &dev_attr_reset_controller.attr,
>>       &dev_attr_rescan_controller.attr,
>> @@ -553,6 +566,9 @@ static struct attribute *nvme_dev_attrs[] = {
>>   #ifdef CONFIG_NVME_AUTH
>>       &dev_attr_dhchap_secret.attr,
>>       &dev_attr_dhchap_ctrl_secret.attr,
>> +#endif
>> +#ifdef CONFIG_NVME_TCP_TLS
>> +    &dev_attr_tls_key.attr,
>>   #endif
>>       NULL
>>   };
>> @@ -583,6 +599,10 @@ static umode_t nvme_dev_attrs_are_visible(struct 
>> kobject *kobj,
>>       if (a == &dev_attr_dhchap_ctrl_secret.attr && !ctrl->opts)
>>           return 0;
>>   #endif
>> +#ifdef CONFIG_NVME_TCP_TLS
>> +    if (a == &dev_attr_tls_key.attr && !ctrl->opts)
>> +        return 0;
>> +#endif
> 
> Probably you want to check that the transport is tcp.
> 
Probably.

>>       return a->mode;
>>   }
>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>> index c3b3b938c0f9..98400ad5f969 100644
>> --- a/drivers/nvme/host/tcp.c
>> +++ b/drivers/nvme/host/tcp.c
>> @@ -8,9 +8,13 @@
>>   #include <linux/init.h>
>>   #include <linux/slab.h>
>>   #include <linux/err.h>
>> +#include <linux/key.h>
>>   #include <linux/nvme-tcp.h>
>> +#include <linux/nvme-keyring.h>
>>   #include <net/sock.h>
>>   #include <net/tcp.h>
>> +#include <net/tls.h>
>> +#include <net/handshake.h>
>>   #include <linux/blk-mq.h>
>>   #include <crypto/hash.h>
>>   #include <net/busy_poll.h>
>> @@ -31,6 +35,16 @@ static int so_priority;
>>   module_param(so_priority, int, 0644);
>>   MODULE_PARM_DESC(so_priority, "nvme tcp socket optimize priority");
>> +#ifdef CONFIG_NVME_TCP_TLS
>> +/*
>> + * TLS handshake timeout
>> + */
>> +static int tls_handshake_timeout = 10;
>> +module_param(tls_handshake_timeout, int, 0644);
>> +MODULE_PARM_DESC(tls_handshake_timeout,
>> +         "nvme TLS handshake timeout in seconds (default 10)");
>> +#endif
>> +
>>   #ifdef CONFIG_DEBUG_LOCK_ALLOC
>>   /* lockdep can detect a circular dependency of the form
>>    *   sk_lock -> mmap_lock (page fault) -> fs locks -> sk_lock
>> @@ -146,7 +160,10 @@ struct nvme_tcp_queue {
>>       struct ahash_request    *snd_hash;
>>       __le32            exp_ddgst;
>>       __le32            recv_ddgst;
>> -
>> +#ifdef CONFIG_NVME_TCP_TLS
>> +    struct completion       tls_complete;
>> +    int                     tls_err;
>> +#endif
>>       struct page_frag_cache    pf_cache;
>>       void (*state_change)(struct sock *);
>> @@ -1509,7 +1526,80 @@ static void nvme_tcp_set_queue_io_cpu(struct 
>> nvme_tcp_queue *queue)
>>       queue->io_cpu = cpumask_next_wrap(n - 1, cpu_online_mask, -1, 
>> false);
>>   }
>> -static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid)
>> +#ifdef CONFIG_NVME_TCP_TLS
>> +static void nvme_tcp_tls_done(void *data, int status, key_serial_t 
>> pskid)
>> +{
>> +    struct nvme_tcp_queue *queue = data;
>> +    struct nvme_tcp_ctrl *ctrl = queue->ctrl;
>> +    int qid = nvme_tcp_queue_id(queue);
>> +    struct key *tls_key;
>> +
>> +    dev_dbg(ctrl->ctrl.device, "queue %d: TLS handshake done, key %x, 
>> status %d\n",
>> +        qid, pskid, status);
>> +
>> +    if (status) {
>> +        queue->tls_err = -status;
>> +        goto out_complete;
>> +    }
>> +
>> +    tls_key = key_lookup(pskid);
>> +    if (IS_ERR(tls_key)) {
>> +        dev_warn(ctrl->ctrl.device, "queue %d: Invalid key %x\n",
>> +             qid, pskid);
>> +        queue->tls_err = -ENOKEY;
>> +    } else {
>> +        ctrl->ctrl.tls_key = tls_key;
>> +        queue->tls_err = 0;
>> +    }
>> +
>> +out_complete:
>> +    complete(&queue->tls_complete);
>> +}
>> +
>> +static int nvme_tcp_start_tls(struct nvme_ctrl *nctrl,
>> +                  struct nvme_tcp_queue *queue,
>> +                  key_serial_t pskid)
>> +{
>> +    int qid = nvme_tcp_queue_id(queue);
>> +    int ret;
>> +    struct tls_handshake_args args;
>> +    unsigned long tmo = tls_handshake_timeout * HZ;
>> +    key_serial_t keyring = nvme_keyring_id();
>> +
>> +    dev_dbg(nctrl->device, "queue %d: start TLS with key %x\n",
>> +        qid, pskid);
>> +    memset(&args, 0, sizeof(args));
>> +    args.ta_sock = queue->sock;
>> +    args.ta_done = nvme_tcp_tls_done;
>> +    args.ta_data = queue;
>> +    args.ta_my_peerids[0] = pskid;
>> +    args.ta_num_peerids = 1;
>> +    args.ta_keyring = keyring;
>> +    args.ta_timeout_ms = tls_handshake_timeout * 1000;
>> +    queue->tls_err = -EOPNOTSUPP;
>> +    init_completion(&queue->tls_complete);
>> +    ret = tls_client_hello_psk(&args, GFP_KERNEL);
>> +    if (ret) {
>> +        dev_err(nctrl->device, "queue %d: failed to start TLS: %d\n",
>> +            qid, ret);
>> +        return ret;
>> +    }
>> +    if (wait_for_completion_timeout(&queue->tls_complete, tmo) == 0) {
> 
> Maybe this should be interruptible as well?
> 
Hmm. Maybe. I'll look into it.

>> +        dev_err(nctrl->device,
>> +            "queue %d: TLS handshake timeout\n", qid);
>> +        ret = -ETIMEDOUT;
>> +    } else {
>> +        dev_dbg(nctrl->device,
>> +            "queue %d: TLS handshake complete, error %d\n",
>> +            qid, queue->tls_err);
>> +        ret = queue->tls_err;
>> +    }
>> +    return ret;
>> +}
>> +#endif
>> +
>> +static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid,
>> +                key_serial_t pskid)
>>   {
>>       struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
>>       struct nvme_tcp_queue *queue = &ctrl->queues[qid];
>> @@ -1633,6 +1723,14 @@ static int nvme_tcp_alloc_queue(struct 
>> nvme_ctrl *nctrl, int qid)
>>           goto err_rcv_pdu;
>>       }
>> +#ifdef CONFIG_NVME_TCP_TLS
> 
> pskid is passed unconditionally, is the ifdef needed?
> maybe make nvme_tcp_start_tls stub if not defined?
> 
Yes, correct.
It's not defined without the #ifdef.

>> +    /* If PSKs are configured try to start TLS */
>> +    if (pskid) {
>> +        ret = nvme_tcp_start_tls(nctrl, queue, pskid);
>> +        if (ret)
>> +            goto err_init_connect;
>> +    }
>> +#endif
>>       ret = nvme_tcp_init_connection(queue);
>>       if (ret)
>>           goto err_init_connect;
>> @@ -1782,10 +1880,22 @@ static int nvme_tcp_start_io_queues(struct 
>> nvme_ctrl *ctrl,
>>   static int nvme_tcp_alloc_admin_queue(struct nvme_ctrl *ctrl)
>>   {
>>       int ret;
>> +    key_serial_t psk_id = 0;
>> +
>> +    if (ctrl->opts->tls) {
>> +        psk_id = nvme_tls_psk_default(NULL,
>> +                          ctrl->opts->host->nqn,
>> +                          ctrl->opts->subsysnqn);
> 
> is the returned tls_key_id guaranteed to be non-zero?
> 
I _think_ so; but let me check.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 10/14] nvme-fabrics: parse options 'keyring' and 'tls_key'
  2023-08-07  8:23   ` Sagi Grimberg
@ 2023-08-07  8:34     ` Hannes Reinecke
  0 siblings, 0 replies; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-07  8:34 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

On 8/7/23 10:23, Sagi Grimberg wrote:
> 
> 
> On 8/3/23 13:50, Hannes Reinecke wrote:
>> Parse the fabrics options 'keyring' and 'tls_key' and store the
>> referenced keys in the options structure.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   drivers/nvme/host/fabrics.c | 69 ++++++++++++++++++++++++++++++++++++-
>>   drivers/nvme/host/fabrics.h |  6 ++++
>>   drivers/nvme/host/tcp.c     | 11 ++++--
>>   3 files changed, 82 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
>> index ddad482c3537..6bdcd505a477 100644
>> --- a/drivers/nvme/host/fabrics.c
>> +++ b/drivers/nvme/host/fabrics.c
>> @@ -643,6 +643,10 @@ static const match_table_t opt_tokens = {
>>       { NVMF_OPT_NR_WRITE_QUEUES,    "nr_write_queues=%d"    },
>>       { NVMF_OPT_NR_POLL_QUEUES,    "nr_poll_queues=%d"    },
>>       { NVMF_OPT_TOS,            "tos=%d"        },
>> +#ifdef CONFIG_NVME_TCP_TLS
>> +    { NVMF_OPT_KEYRING,        "keyring=%d"        },
>> +    { NVMF_OPT_TLS_KEY,        "tls_key=%d"        },
>> +#endif
>>       { NVMF_OPT_FAIL_FAST_TMO,    "fast_io_fail_tmo=%d"    },
>>       { NVMF_OPT_DISCOVERY,        "discovery"        },
>>       { NVMF_OPT_DHCHAP_SECRET,    "dhchap_secret=%s"    },
>> @@ -660,9 +664,10 @@ static int nvmf_parse_options(struct 
>> nvmf_ctrl_options *opts,
>>       char *options, *o, *p;
>>       int token, ret = 0;
>>       size_t nqnlen  = 0;
>> -    int ctrl_loss_tmo = NVMF_DEF_CTRL_LOSS_TMO;
>> +    int ctrl_loss_tmo = NVMF_DEF_CTRL_LOSS_TMO, key_id;
>>       uuid_t hostid;
>>       char hostnqn[NVMF_NQN_SIZE];
>> +    struct key *key = NULL;
>>       /* Set defaults */
>>       opts->queue_size = NVMF_DEF_QUEUE_SIZE;
>> @@ -928,6 +933,66 @@ static int nvmf_parse_options(struct 
>> nvmf_ctrl_options *opts,
>>               }
>>               opts->tos = token;
>>               break;
>> +        case NVMF_OPT_KEYRING:
>> +            if (!IS_ENABLED(CONFIG_NVME_TCP_TLS)) {
>> +                pr_err("TLS is not supported\n");
>> +                ret = -EINVAL;
>> +                goto out;
>> +            }
>> +            if (match_int(args, &key_id)) {
>> +                ret = -EINVAL;
>> +                goto out;
>> +            }
>> +            if (key_id < 0) {
>> +                pr_err("Invalid keyring id %d\n", key_id);
>> +                ret = -EINVAL;
>> +                goto out;
>> +            }
>> +            if (!key_id) {
>> +                pr_debug("Using default keyring\n");
>> +                key_put(opts->keyring);
>> +                opts->keyring = NULL;
>> +                break;
>> +            }
>> +            key = key_lookup(key_id);
>> +            if (!key) {
>> +                pr_err("Keyring id %08x not found\n", key_id);
>> +                ret = -ENOKEY;
>> +                goto out;
>> +            }
>> +            key_put(opts->keyring);
>> +            opts->keyring = key;
>> +            break;
>> +        case NVMF_OPT_TLS_KEY:
>> +            if (!IS_ENABLED(CONFIG_NVME_TCP_TLS)) {
>> +                pr_err("TLS is not supported\n");
>> +                ret = -EINVAL;
>> +                goto out;
>> +            }
>> +            if (match_int(args, &key_id)) {
>> +                ret = -EINVAL;
>> +                goto out;
>> +            }
>> +            if (key_id < 0) {
>> +                pr_err("Invalid key id %d\n", key_id);
>> +                ret = -EINVAL;
>> +                goto out;
>> +            }
>> +            if (!key_id) {
>> +                pr_debug("Using 'best' PSK\n");
>> +                key_put(opts->tls_key);
>> +                opts->tls_key = NULL;
>> +                break;
>> +            }
>> +            key = key_lookup(key_id);
>> +            if (!key) {
>> +                pr_err("Key id %08x not found\n", key_id);
>> +                ret = -ENOKEY;
>> +                goto out;
>> +            }
>> +            key_put(opts->tls_key);
>> +            opts->tls_key = key;
>> +            break;
> 
> Maybe move to helper(s)? looks pretty similar.
> 
I'll check.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 12/14] nvmet-tcp: allocate socket file
  2023-08-07  8:27   ` Sagi Grimberg
@ 2023-08-07  8:49     ` Hannes Reinecke
  2023-08-07  8:53       ` Sagi Grimberg
  0 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-07  8:49 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

On 8/7/23 10:27, Sagi Grimberg wrote:
> 
> 
> On 8/3/23 13:51, Hannes Reinecke wrote:
>> For the TLS upcall we need to allocate a socket file such
>> that the userspace daemon is able to use the socket.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   drivers/nvme/target/tcp.c | 29 +++++++++++++++++------------
>>   1 file changed, 17 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
>> index 868aa4de2e4c..fdc351f591a4 100644
>> --- a/drivers/nvme/target/tcp.c
>> +++ b/drivers/nvme/target/tcp.c
>> @@ -1493,12 +1493,12 @@ static void 
>> nvmet_tcp_release_queue_work(struct work_struct *w)
>>       nvmet_sq_destroy(&queue->nvme_sq);
>>       cancel_work_sync(&queue->io_work);
>>       nvmet_tcp_free_cmd_data_in_buffers(queue);
>> -    sock_release(queue->sock);
>> +    /* ->sock will be released by fput() */
>> +    fput(queue->sock->file);
>>       nvmet_tcp_free_cmds(queue);
>>       if (queue->hdr_digest || queue->data_digest)
>>           nvmet_tcp_free_crypto(queue);
>>       ida_free(&nvmet_tcp_queue_ida, queue->idx);
>> -
>>       page = virt_to_head_page(queue->pf_cache.va);
>>       __page_frag_cache_drain(page, queue->pf_cache.pagecnt_bias);
>>       kfree(queue);
>> @@ -1621,15 +1621,16 @@ static int nvmet_tcp_set_queue_sock(struct 
>> nvmet_tcp_queue *queue)
>>       return ret;
>>   }
>> -static int nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>> +static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>>           struct socket *newsock)
> 
> Why does the function change from retcode to void in this patch?

Because the return code was never evaluated.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 13/14] nvmet-tcp: enable TLS handshake upcall
  2023-08-03 10:51 ` [PATCH 13/14] nvmet-tcp: enable TLS handshake upcall Hannes Reinecke
@ 2023-08-07  8:51   ` Sagi Grimberg
  2023-08-07  9:15     ` Hannes Reinecke
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-07  8:51 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme



On 8/3/23 13:51, Hannes Reinecke wrote:
> Add functions to start the TLS handshake upcall when
> the TCP TSAS sectype is set to 'tls1.3' and add a config
> option NVME_TARGET_TCP_TLS.
> 
> Signed-off-by: Hannes Reincke <hare@suse.de>
> ---
>   drivers/nvme/target/Kconfig    |  14 ++++
>   drivers/nvme/target/configfs.c |  63 +++++++++++++++-
>   drivers/nvme/target/nvmet.h    |   1 +
>   drivers/nvme/target/tcp.c      | 133 +++++++++++++++++++++++++++++++--
>   4 files changed, 203 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/nvme/target/Kconfig b/drivers/nvme/target/Kconfig
> index 79fc64035ee3..15af2b4341f6 100644
> --- a/drivers/nvme/target/Kconfig
> +++ b/drivers/nvme/target/Kconfig
> @@ -84,6 +84,20 @@ config NVME_TARGET_TCP
>   
>   	  If unsure, say N.
>   
> +config NVME_TARGET_TCP_TLS
> +	bool "NVMe over Fabrics TCP target TLS encryption support"
> +	depends on NVME_TARGET_TCP
> +	select NVME_COMMON
> +	select NVME_KEYRING
> +	select NET_HANDSHAKE
> +	help
> +	  Enables TLS encryption for the NVMe TCP target using the netlink handshake API.
> +
> +	  The TLS handshake daemon is availble at
> +	  https://github.com/oracle/ktls-utils.
> +
> +	  If unsure, say N.
> +
>   config NVME_TARGET_AUTH
>   	bool "NVMe over Fabrics In-band Authentication support"
>   	depends on NVME_TARGET
> diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
> index 7f826ac8b75c..49b407702ad5 100644
> --- a/drivers/nvme/target/configfs.c
> +++ b/drivers/nvme/target/configfs.c
> @@ -15,6 +15,7 @@
>   #ifdef CONFIG_NVME_TARGET_AUTH
>   #include <linux/nvme-auth.h>
>   #endif
> +#include <linux/nvme-keyring.h>
>   #include <crypto/hash.h>
>   #include <crypto/kpp.h>
>   
> @@ -159,10 +160,15 @@ static const struct nvmet_type_name_map nvmet_addr_treq[] = {
>   	{ NVMF_TREQ_NOT_REQUIRED,	"not required" },
>   };
>   
> +static inline u8 nvmet_port_treq(struct nvmet_port *port)
> +{
> +	return (port->disc_addr.treq & NVME_TREQ_SECURE_CHANNEL_MASK);
> +}
> +
>   static ssize_t nvmet_addr_treq_show(struct config_item *item, char *page)
>   {
> -	u8 treq = to_nvmet_port(item)->disc_addr.treq &
> -		NVME_TREQ_SECURE_CHANNEL_MASK;
> +	struct nvmet_port *port = to_nvmet_port(item);
> +	u8 treq = nvmet_port_treq(port);
>   	int i;
>   
>   	for (i = 0; i < ARRAY_SIZE(nvmet_addr_treq); i++) {
> @@ -174,11 +180,16 @@ static ssize_t nvmet_addr_treq_show(struct config_item *item, char *page)
>   	return snprintf(page, PAGE_SIZE, "\n");
>   }
>   
> +static inline u8 nvmet_port_treq_mask(struct nvmet_port *port)
> +{
> +	return (port->disc_addr.treq & ~NVME_TREQ_SECURE_CHANNEL_MASK);
> +}
> +
>   static ssize_t nvmet_addr_treq_store(struct config_item *item,
>   		const char *page, size_t count)
>   {
>   	struct nvmet_port *port = to_nvmet_port(item);
> -	u8 treq = port->disc_addr.treq & ~NVME_TREQ_SECURE_CHANNEL_MASK;
> +	u8 treq = nvmet_port_treq_mask(port);
>   	int i;
>   
>   	if (nvmet_is_port_enabled(port, __func__))
> @@ -193,6 +204,23 @@ static ssize_t nvmet_addr_treq_store(struct config_item *item,
>   	return -EINVAL;
>   
>   found:
> +	if (port->disc_addr.trtype == NVMF_TRTYPE_TCP) {
> +		if (!IS_ENABLED(CONFIG_NVME_TARGET_TCP_TLS)) {
> +			pr_err("TLS is not supported\n");
> +			return -EINVAL;
> +		}
> +		if (!port->keyring) {
> +			pr_err("TLS keyring not configured\n");
> +			return -EINVAL;
> +		}
> +		if (port->disc_addr.tsas.tcp.sectype != NVMF_TCP_SECTYPE_TLS13) {
> +			pr_warn("cannot change TREQ when TLS is not enabled\n");
> +			return -EINVAL;
> +		} else if (nvmet_addr_treq[i].type == NVMF_TREQ_NOT_SPECIFIED) {
> +			pr_warn("cannot set TREQ to 'not specified' when TLS is enabled\n");
> +			return -EINVAL;
> +		}
> +	}
>   	treq |= nvmet_addr_treq[i].type;
>   	port->disc_addr.treq = treq;
>   	return count;
> @@ -371,6 +399,7 @@ static ssize_t nvmet_addr_tsas_store(struct config_item *item,
>   		const char *page, size_t count)
>   {
>   	struct nvmet_port *port = to_nvmet_port(item);
> +	u8 treq = nvmet_port_treq_mask(port);
>   	int i;
>   
>   	if (nvmet_is_port_enabled(port, __func__))
> @@ -379,6 +408,15 @@ static ssize_t nvmet_addr_tsas_store(struct config_item *item,
>   	if (port->disc_addr.trtype != NVMF_TRTYPE_TCP)
>   		return -EINVAL;
>   
> +	if (!IS_ENABLED(CONFIG_NVME_TARGET_TCP_TLS)) {
> +		pr_err("TLS is not supported\n");
> +		return -EINVAL;
> +	}
> +	if (!port->keyring) {
> +		pr_err("TLS keyring not configured\n");
> +		return -EINVAL;
> +	}
> +
>   	for (i = 0; i < ARRAY_SIZE(nvmet_addr_tsas_tcp); i++) {
>   		if (sysfs_streq(page, nvmet_addr_tsas_tcp[i].name))
>   			goto found;
> @@ -389,6 +427,16 @@ static ssize_t nvmet_addr_tsas_store(struct config_item *item,
>   
>   found:
>   	nvmet_port_init_tsas_tcp(port, nvmet_addr_tsas_tcp[i].type);
> +	if (nvmet_addr_tsas_tcp[i].type == NVMF_TCP_SECTYPE_TLS13) {
> +		if (nvmet_port_treq(port) == NVMF_TREQ_NOT_SPECIFIED)
> +			treq |= NVMF_TREQ_REQUIRED;
> +		else
> +			treq |= nvmet_port_treq(port);
> +	} else {
> +		/* Set to 'not specified' if TLS is not enabled */
> +		treq |= NVMF_TREQ_NOT_SPECIFIED;
> +	}
> +	port->disc_addr.treq = treq;
>   	return count;
>   }
>   

Can the treq/tsas be split from the actual nvmet-tcp upcall addition?

> @@ -1795,6 +1843,7 @@ static void nvmet_port_release(struct config_item *item)
>   	flush_workqueue(nvmet_wq);
>   	list_del(&port->global_entry);
>   
> +	key_put(port->keyring);
>   	kfree(port->ana_state);
>   	kfree(port);
>   }
> @@ -1844,6 +1893,14 @@ static struct config_group *nvmet_ports_make(struct config_group *group,
>   		return ERR_PTR(-ENOMEM);
>   	}
>   
> +	if (nvme_keyring_id()) {
> +		port->keyring = key_lookup(nvme_keyring_id());
> +		if (IS_ERR(port->keyring)) {
> +			pr_warn("NVMe keyring not available, disabling TLS\n");
> +			port->keyring = NULL;
> +		}
> +	}
> +

Nice.

>   	for (i = 1; i <= NVMET_MAX_ANAGRPS; i++) {
>   		if (i == NVMET_DEFAULT_ANA_GRPID)
>   			port->ana_state[1] = NVME_ANA_OPTIMIZED;
> diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
> index 8cfd60f3b564..7f9ae53c1df5 100644
> --- a/drivers/nvme/target/nvmet.h
> +++ b/drivers/nvme/target/nvmet.h
> @@ -158,6 +158,7 @@ struct nvmet_port {
>   	struct config_group		ana_groups_group;
>   	struct nvmet_ana_group		ana_default_group;
>   	enum nvme_ana_state		*ana_state;
> +	struct key			*keyring;
>   	void				*priv;
>   	bool				enabled;
>   	int				inline_data_size;
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index fdc351f591a4..7279c994abd6 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -8,9 +8,13 @@
>   #include <linux/init.h>
>   #include <linux/slab.h>
>   #include <linux/err.h>
> +#include <linux/key.h>
>   #include <linux/nvme-tcp.h>
> +#include <linux/nvme-keyring.h>
>   #include <net/sock.h>
>   #include <net/tcp.h>
> +#include <net/tls.h>
> +#include <net/handshake.h>
>   #include <linux/inet.h>
>   #include <linux/llist.h>
>   #include <crypto/hash.h>
> @@ -66,6 +70,16 @@ device_param_cb(idle_poll_period_usecs, &set_param_ops,
>   MODULE_PARM_DESC(idle_poll_period_usecs,
>   		"nvmet tcp io_work poll till idle time period in usecs: Default 0");
>   
> +#ifdef CONFIG_NVME_TARGET_TCP_TLS
> +/*
> + * TLS handshake timeout
> + */
> +static int tls_handshake_timeout = 30;
> +module_param(tls_handshake_timeout, int, 0644);
> +MODULE_PARM_DESC(tls_handshake_timeout,
> +		 "nvme TLS handshake timeout in seconds (default 30)");
> +#endif

On the host it is 10 and here 30? what is the source of the assymmetry?

> +
>   #define NVMET_TCP_RECV_BUDGET		8
>   #define NVMET_TCP_SEND_BUDGET		8
>   #define NVMET_TCP_IO_WORK_BUDGET	64
> @@ -122,6 +136,7 @@ struct nvmet_tcp_cmd {
>   
>   enum nvmet_tcp_queue_state {
>   	NVMET_TCP_Q_CONNECTING,
> +	NVMET_TCP_Q_TLS_HANDSHAKE,
>   	NVMET_TCP_Q_LIVE,
>   	NVMET_TCP_Q_DISCONNECTING,
>   };
> @@ -154,6 +169,8 @@ struct nvmet_tcp_queue {
>   	bool			data_digest;
>   	struct ahash_request	*snd_hash;
>   	struct ahash_request	*rcv_hash;
> +	key_serial_t		tls_pskid;
> +	struct delayed_work	tls_handshake_work;
>   
>   	unsigned long           poll_end;
>   
> @@ -1285,12 +1302,12 @@ static int nvmet_tcp_try_recv(struct nvmet_tcp_queue *queue,
>   
>   static void nvmet_tcp_schedule_release_queue(struct nvmet_tcp_queue *queue)
>   {
> -	spin_lock(&queue->state_lock);
> +	spin_lock_irq(&queue->state_lock);

Where is this lock taken in irq context that needs disabling irq?

>   	if (queue->state != NVMET_TCP_Q_DISCONNECTING) {
>   		queue->state = NVMET_TCP_Q_DISCONNECTING;
>   		queue_work(nvmet_wq, &queue->release_work);
>   	}
> -	spin_unlock(&queue->state_lock);
> +	spin_unlock_irq(&queue->state_lock);
>   }
>   
>   static inline void nvmet_tcp_arm_queue_deadline(struct nvmet_tcp_queue *queue)
> @@ -1512,8 +1529,12 @@ static void nvmet_tcp_data_ready(struct sock *sk)
>   
>   	read_lock_bh(&sk->sk_callback_lock);
>   	queue = sk->sk_user_data;
> -	if (likely(queue))
> -		queue_work_on(queue_cpu(queue), nvmet_tcp_wq, &queue->io_work);
> +	if (queue->data_ready)
> +		queue->data_ready(sk);
> +	if (likely(queue) &&
> +	    queue->state != NVMET_TCP_Q_TLS_HANDSHAKE)
> +		queue_work_on(queue_cpu(queue), nvmet_tcp_wq,
> +			      &queue->io_work);
>   	read_unlock_bh(&sk->sk_callback_lock);
>   }
>   
> @@ -1621,6 +1642,85 @@ static int nvmet_tcp_set_queue_sock(struct nvmet_tcp_queue *queue)
>   	return ret;
>   }
>   
> +#ifdef CONFIG_NVME_TARGET_TCP_TLS
> +static void nvmet_tcp_tls_queue_reset(struct nvmet_tcp_queue *queue)
> +{
> +	spin_lock_irq(&queue->state_lock);
> +	if (queue->state != NVMET_TCP_Q_TLS_HANDSHAKE) {
> +		pr_warn("queue %d: TLS handshake already completed\n",
> +			queue->idx);
> +		spin_unlock_irq(&queue->state_lock);
> +		return;

trigger fatal error here?

> +	}
> +	queue->state = NVMET_TCP_Q_CONNECTING;
> +	spin_unlock_irq(&queue->state_lock);
> +
> +	pr_debug("queue %d: resetting queue callbacks after TLS handshake\n",
> +		 queue->idx);
> +	/*
> +	 * Set callbacks after handshake; TLS implementation
> +	 * might have changed the socket callbacks.
> +	 */
> +	nvmet_tcp_set_queue_sock(queue);
> +}
> +
> +static void nvmet_tcp_tls_handshake_done(void *data, int status,
> +					 key_serial_t peerid)

lets call peerid psk_id throughout.

> +{
> +	struct nvmet_tcp_queue *queue = data;
> +
> +	pr_debug("queue %d: TLS handshake done, key %x, status %d\n",
> +		 queue->idx, peerid, status);
> +	if (!status) {
> +		spin_lock_irq(&queue->state_lock);
> +		queue->tls_pskid = peerid;
> +		spin_unlock_irq(&queue->state_lock);
> +	}
> +	cancel_delayed_work_sync(&queue->tls_handshake_work);

Hmm, the cancel_delayed_work_sync is scary.

What happens if it ran and already scheduled a release (which
already ran and completed)?

> +	if (status)
> +		nvmet_tcp_schedule_release_queue(queue);
> +	else
> +		nvmet_tcp_tls_queue_reset(queue);

What I think you need is an atomic state that one or the
other access, and then you are fine with a normal async
cancel of the delayed_work.

> +}
> +
> +static void nvmet_tcp_tls_handshake_timeout_work(struct work_struct *w)
> +{
> +	struct nvmet_tcp_queue *queue = container_of(to_delayed_work(w),
> +			struct nvmet_tcp_queue, tls_handshake_work);
> +
> +	pr_debug("queue %d: TLS handshake timeout\n", queue->idx);
> +	nvmet_tcp_schedule_release_queue(queue);
> +}
> +
> +static int nvmet_tcp_tls_handshake(struct nvmet_tcp_queue *queue)
> +{
> +	int ret = -EOPNOTSUPP;
> +	struct tls_handshake_args args;
> +
> +	if (queue->state != NVMET_TCP_Q_TLS_HANDSHAKE) {
> +		pr_warn("cannot start TLS in state %d\n", queue->state);
> +		return -EINVAL;
> +	}
> +
> +	pr_debug("queue %d: TLS ServerHello\n", queue->idx);
> +	memset(&args, 0, sizeof(args));
> +	args.ta_sock = queue->sock;
> +	args.ta_done = nvmet_tcp_tls_handshake_done;
> +	args.ta_data = queue;
> +	args.ta_keyring = key_serial(queue->port->nport->keyring);
> +	args.ta_timeout_ms = tls_handshake_timeout * 2 * 1024;

* 2 * 1024 ? I didn't know we have 2048 ms in a second...

> +
> +	ret = tls_server_hello_psk(&args, GFP_KERNEL);
> +	if (ret) {
> +		pr_err("failed to start TLS, err=%d\n", ret);
> +	} else {
> +		queue_delayed_work(nvmet_wq, &queue->tls_handshake_work,
> +				   tls_handshake_timeout * HZ);
> +	}
> +	return ret;
> +}
> +#endif
> +
>   static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>   		struct socket *newsock)
>   {
> @@ -1638,7 +1738,11 @@ static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>   	queue->port = port;
>   	queue->nr_cmds = 0;
>   	spin_lock_init(&queue->state_lock);
> -	queue->state = NVMET_TCP_Q_CONNECTING;
> +	if (queue->port->nport->disc_addr.tsas.tcp.sectype ==
> +	    NVMF_TCP_SECTYPE_TLS13)
> +		queue->state = NVMET_TCP_Q_TLS_HANDSHAKE;
> +	else
> +		queue->state = NVMET_TCP_Q_CONNECTING;
>   	INIT_LIST_HEAD(&queue->free_list);
>   	init_llist_head(&queue->resp_list);
>   	INIT_LIST_HEAD(&queue->resp_send_list);
> @@ -1669,6 +1773,25 @@ static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>   	list_add_tail(&queue->queue_list, &nvmet_tcp_queue_list);
>   	mutex_unlock(&nvmet_tcp_queue_mutex);
>   
> +#ifdef CONFIG_NVME_TARGET_TCP_TLS

maybe the ifdef can be avoided with stubs?

> +	INIT_DELAYED_WORK(&queue->tls_handshake_work,
> +			  nvmet_tcp_tls_handshake_timeout_work);
> +	if (queue->state == NVMET_TCP_Q_TLS_HANDSHAKE) {
> +		struct sock *sk = queue->sock->sk;
> +
> +		/* Restore the default callbacks before starting upcall */
> +		read_lock_bh(&sk->sk_callback_lock);
> +		sk->sk_user_data = NULL;
> +		sk->sk_data_ready = port->data_ready;
> +		read_unlock_bh(&sk->sk_callback_lock);
> +		if (!nvmet_tcp_tls_handshake(queue))
> +			return;
> +
> +		/* TLS handshake failed, terminate the connection */
> +		goto out_destroy_sq;
> +	}
> +#endif
> +
>   	ret = nvmet_tcp_set_queue_sock(queue);
>   	if (ret)
>   		goto out_destroy_sq;


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 12/14] nvmet-tcp: allocate socket file
  2023-08-07  8:49     ` Hannes Reinecke
@ 2023-08-07  8:53       ` Sagi Grimberg
  2023-08-07  9:17         ` Hannes Reinecke
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-07  8:53 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme


>>> -static int nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>>> +static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>>>           struct socket *newsock)
>>
>> Why does the function change from retcode to void in this patch?
> 
> Because the return code was never evaluated.

??

I see it is evaluated.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 13/14] nvmet-tcp: enable TLS handshake upcall
  2023-08-07  8:51   ` Sagi Grimberg
@ 2023-08-07  9:15     ` Hannes Reinecke
  2023-08-07 11:49       ` Sagi Grimberg
  0 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-07  9:15 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

On 8/7/23 10:51, Sagi Grimberg wrote:
> 
> 
> On 8/3/23 13:51, Hannes Reinecke wrote:
>> Add functions to start the TLS handshake upcall when
>> the TCP TSAS sectype is set to 'tls1.3' and add a config
>> option NVME_TARGET_TCP_TLS.
>>
>> Signed-off-by: Hannes Reincke <hare@suse.de>
>> ---
>>   drivers/nvme/target/Kconfig    |  14 ++++
>>   drivers/nvme/target/configfs.c |  63 +++++++++++++++-
>>   drivers/nvme/target/nvmet.h    |   1 +
>>   drivers/nvme/target/tcp.c      | 133 +++++++++++++++++++++++++++++++--
>>   4 files changed, 203 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/nvme/target/Kconfig b/drivers/nvme/target/Kconfig
>> index 79fc64035ee3..15af2b4341f6 100644
>> --- a/drivers/nvme/target/Kconfig
>> +++ b/drivers/nvme/target/Kconfig
>> @@ -84,6 +84,20 @@ config NVME_TARGET_TCP
>>         If unsure, say N.
>> +config NVME_TARGET_TCP_TLS
>> +    bool "NVMe over Fabrics TCP target TLS encryption support"
>> +    depends on NVME_TARGET_TCP
>> +    select NVME_COMMON
>> +    select NVME_KEYRING
>> +    select NET_HANDSHAKE
>> +    help
>> +      Enables TLS encryption for the NVMe TCP target using the 
>> netlink handshake API.
>> +
>> +      The TLS handshake daemon is availble at
>> +      https://github.com/oracle/ktls-utils.
>> +
>> +      If unsure, say N.
>> +
>>   config NVME_TARGET_AUTH
>>       bool "NVMe over Fabrics In-band Authentication support"
>>       depends on NVME_TARGET
>> diff --git a/drivers/nvme/target/configfs.c 
>> b/drivers/nvme/target/configfs.c
>> index 7f826ac8b75c..49b407702ad5 100644
>> --- a/drivers/nvme/target/configfs.c
>> +++ b/drivers/nvme/target/configfs.c
>> @@ -15,6 +15,7 @@
>>   #ifdef CONFIG_NVME_TARGET_AUTH
>>   #include <linux/nvme-auth.h>
>>   #endif
>> +#include <linux/nvme-keyring.h>
>>   #include <crypto/hash.h>
>>   #include <crypto/kpp.h>
>> @@ -159,10 +160,15 @@ static const struct nvmet_type_name_map 
>> nvmet_addr_treq[] = {
>>       { NVMF_TREQ_NOT_REQUIRED,    "not required" },
>>   };
>> +static inline u8 nvmet_port_treq(struct nvmet_port *port)
>> +{
>> +    return (port->disc_addr.treq & NVME_TREQ_SECURE_CHANNEL_MASK);
>> +}
>> +
>>   static ssize_t nvmet_addr_treq_show(struct config_item *item, char 
>> *page)
>>   {
>> -    u8 treq = to_nvmet_port(item)->disc_addr.treq &
>> -        NVME_TREQ_SECURE_CHANNEL_MASK;
>> +    struct nvmet_port *port = to_nvmet_port(item);
>> +    u8 treq = nvmet_port_treq(port);
>>       int i;
>>       for (i = 0; i < ARRAY_SIZE(nvmet_addr_treq); i++) {
>> @@ -174,11 +180,16 @@ static ssize_t nvmet_addr_treq_show(struct 
>> config_item *item, char *page)
>>       return snprintf(page, PAGE_SIZE, "\n");
>>   }
>> +static inline u8 nvmet_port_treq_mask(struct nvmet_port *port)
>> +{
>> +    return (port->disc_addr.treq & ~NVME_TREQ_SECURE_CHANNEL_MASK);
>> +}
>> +
>>   static ssize_t nvmet_addr_treq_store(struct config_item *item,
>>           const char *page, size_t count)
>>   {
>>       struct nvmet_port *port = to_nvmet_port(item);
>> -    u8 treq = port->disc_addr.treq & ~NVME_TREQ_SECURE_CHANNEL_MASK;
>> +    u8 treq = nvmet_port_treq_mask(port);
>>       int i;
>>       if (nvmet_is_port_enabled(port, __func__))
>> @@ -193,6 +204,23 @@ static ssize_t nvmet_addr_treq_store(struct 
>> config_item *item,
>>       return -EINVAL;
>>   found:
>> +    if (port->disc_addr.trtype == NVMF_TRTYPE_TCP) {
>> +        if (!IS_ENABLED(CONFIG_NVME_TARGET_TCP_TLS)) {
>> +            pr_err("TLS is not supported\n");
>> +            return -EINVAL;
>> +        }
>> +        if (!port->keyring) {
>> +            pr_err("TLS keyring not configured\n");
>> +            return -EINVAL;
>> +        }
>> +        if (port->disc_addr.tsas.tcp.sectype != 
>> NVMF_TCP_SECTYPE_TLS13) {
>> +            pr_warn("cannot change TREQ when TLS is not enabled\n");
>> +            return -EINVAL;
>> +        } else if (nvmet_addr_treq[i].type == NVMF_TREQ_NOT_SPECIFIED) {
>> +            pr_warn("cannot set TREQ to 'not specified' when TLS is 
>> enabled\n");
>> +            return -EINVAL;
>> +        }
>> +    }
>>       treq |= nvmet_addr_treq[i].type;
>>       port->disc_addr.treq = treq;
>>       return count;
>> @@ -371,6 +399,7 @@ static ssize_t nvmet_addr_tsas_store(struct 
>> config_item *item,
>>           const char *page, size_t count)
>>   {
>>       struct nvmet_port *port = to_nvmet_port(item);
>> +    u8 treq = nvmet_port_treq_mask(port);
>>       int i;
>>       if (nvmet_is_port_enabled(port, __func__))
>> @@ -379,6 +408,15 @@ static ssize_t nvmet_addr_tsas_store(struct 
>> config_item *item,
>>       if (port->disc_addr.trtype != NVMF_TRTYPE_TCP)
>>           return -EINVAL;
>> +    if (!IS_ENABLED(CONFIG_NVME_TARGET_TCP_TLS)) {
>> +        pr_err("TLS is not supported\n");
>> +        return -EINVAL;
>> +    }
>> +    if (!port->keyring) {
>> +        pr_err("TLS keyring not configured\n");
>> +        return -EINVAL;
>> +    }
>> +
>>       for (i = 0; i < ARRAY_SIZE(nvmet_addr_tsas_tcp); i++) {
>>           if (sysfs_streq(page, nvmet_addr_tsas_tcp[i].name))
>>               goto found;
>> @@ -389,6 +427,16 @@ static ssize_t nvmet_addr_tsas_store(struct 
>> config_item *item,
>>   found:
>>       nvmet_port_init_tsas_tcp(port, nvmet_addr_tsas_tcp[i].type);
>> +    if (nvmet_addr_tsas_tcp[i].type == NVMF_TCP_SECTYPE_TLS13) {
>> +        if (nvmet_port_treq(port) == NVMF_TREQ_NOT_SPECIFIED)
>> +            treq |= NVMF_TREQ_REQUIRED;
>> +        else
>> +            treq |= nvmet_port_treq(port);
>> +    } else {
>> +        /* Set to 'not specified' if TLS is not enabled */
>> +        treq |= NVMF_TREQ_NOT_SPECIFIED;
>> +    }
>> +    port->disc_addr.treq = treq;
>>       return count;
>>   }
> 
> Can the treq/tsas be split from the actual nvmet-tcp upcall addition?
> 
I guess it can; it's just that it doesn't make any sense if we don't 
have the TLS upcall implemented.

>> @@ -1795,6 +1843,7 @@ static void nvmet_port_release(struct 
>> config_item *item)
>>       flush_workqueue(nvmet_wq);
>>       list_del(&port->global_entry);
>> +    key_put(port->keyring);
>>       kfree(port->ana_state);
>>       kfree(port);
>>   }
>> @@ -1844,6 +1893,14 @@ static struct config_group 
>> *nvmet_ports_make(struct config_group *group,
>>           return ERR_PTR(-ENOMEM);
>>       }
>> +    if (nvme_keyring_id()) {
>> +        port->keyring = key_lookup(nvme_keyring_id());
>> +        if (IS_ERR(port->keyring)) {
>> +            pr_warn("NVMe keyring not available, disabling TLS\n");
>> +            port->keyring = NULL;
>> +        }
>> +    }
>> +
> 
> Nice.
> 
>>       for (i = 1; i <= NVMET_MAX_ANAGRPS; i++) {
>>           if (i == NVMET_DEFAULT_ANA_GRPID)
>>               port->ana_state[1] = NVME_ANA_OPTIMIZED;
>> diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
>> index 8cfd60f3b564..7f9ae53c1df5 100644
>> --- a/drivers/nvme/target/nvmet.h
>> +++ b/drivers/nvme/target/nvmet.h
>> @@ -158,6 +158,7 @@ struct nvmet_port {
>>       struct config_group        ana_groups_group;
>>       struct nvmet_ana_group        ana_default_group;
>>       enum nvme_ana_state        *ana_state;
>> +    struct key            *keyring;
>>       void                *priv;
>>       bool                enabled;
>>       int                inline_data_size;
>> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
>> index fdc351f591a4..7279c994abd6 100644
>> --- a/drivers/nvme/target/tcp.c
>> +++ b/drivers/nvme/target/tcp.c
>> @@ -8,9 +8,13 @@
>>   #include <linux/init.h>
>>   #include <linux/slab.h>
>>   #include <linux/err.h>
>> +#include <linux/key.h>
>>   #include <linux/nvme-tcp.h>
>> +#include <linux/nvme-keyring.h>
>>   #include <net/sock.h>
>>   #include <net/tcp.h>
>> +#include <net/tls.h>
>> +#include <net/handshake.h>
>>   #include <linux/inet.h>
>>   #include <linux/llist.h>
>>   #include <crypto/hash.h>
>> @@ -66,6 +70,16 @@ device_param_cb(idle_poll_period_usecs, 
>> &set_param_ops,
>>   MODULE_PARM_DESC(idle_poll_period_usecs,
>>           "nvmet tcp io_work poll till idle time period in usecs: 
>> Default 0");
>> +#ifdef CONFIG_NVME_TARGET_TCP_TLS
>> +/*
>> + * TLS handshake timeout
>> + */
>> +static int tls_handshake_timeout = 30;
>> +module_param(tls_handshake_timeout, int, 0644);
>> +MODULE_PARM_DESC(tls_handshake_timeout,
>> +         "nvme TLS handshake timeout in seconds (default 30)");
>> +#endif
> 
> On the host it is 10 and here 30? what is the source of the assymmetry?
> 
Hmm. Looks like an oversight. Will be aligning them.

>> +
>>   #define NVMET_TCP_RECV_BUDGET        8
>>   #define NVMET_TCP_SEND_BUDGET        8
>>   #define NVMET_TCP_IO_WORK_BUDGET    64
>> @@ -122,6 +136,7 @@ struct nvmet_tcp_cmd {
>>   enum nvmet_tcp_queue_state {
>>       NVMET_TCP_Q_CONNECTING,
>> +    NVMET_TCP_Q_TLS_HANDSHAKE,
>>       NVMET_TCP_Q_LIVE,
>>       NVMET_TCP_Q_DISCONNECTING,
>>   };
>> @@ -154,6 +169,8 @@ struct nvmet_tcp_queue {
>>       bool            data_digest;
>>       struct ahash_request    *snd_hash;
>>       struct ahash_request    *rcv_hash;
>> +    key_serial_t        tls_pskid;
>> +    struct delayed_work    tls_handshake_work;
>>       unsigned long           poll_end;
>> @@ -1285,12 +1302,12 @@ static int nvmet_tcp_try_recv(struct 
>> nvmet_tcp_queue *queue,
>>   static void nvmet_tcp_schedule_release_queue(struct nvmet_tcp_queue 
>> *queue)
>>   {
>> -    spin_lock(&queue->state_lock);
>> +    spin_lock_irq(&queue->state_lock);
> 
> Where is this lock taken in irq context that needs disabling irq?
> 
Let me check; might be that it got solved with the workqueue fixes which 
went in lately.

>>       if (queue->state != NVMET_TCP_Q_DISCONNECTING) {
>>           queue->state = NVMET_TCP_Q_DISCONNECTING;
>>           queue_work(nvmet_wq, &queue->release_work);
>>       }
>> -    spin_unlock(&queue->state_lock);
>> +    spin_unlock_irq(&queue->state_lock);
>>   }
>>   static inline void nvmet_tcp_arm_queue_deadline(struct 
>> nvmet_tcp_queue *queue)
>> @@ -1512,8 +1529,12 @@ static void nvmet_tcp_data_ready(struct sock *sk)
>>       read_lock_bh(&sk->sk_callback_lock);
>>       queue = sk->sk_user_data;
>> -    if (likely(queue))
>> -        queue_work_on(queue_cpu(queue), nvmet_tcp_wq, &queue->io_work);
>> +    if (queue->data_ready)
>> +        queue->data_ready(sk);
>> +    if (likely(queue) &&
>> +        queue->state != NVMET_TCP_Q_TLS_HANDSHAKE)
>> +        queue_work_on(queue_cpu(queue), nvmet_tcp_wq,
>> +                  &queue->io_work);
>>       read_unlock_bh(&sk->sk_callback_lock);
>>   }
>> @@ -1621,6 +1642,85 @@ static int nvmet_tcp_set_queue_sock(struct 
>> nvmet_tcp_queue *queue)
>>       return ret;
>>   }
>> +#ifdef CONFIG_NVME_TARGET_TCP_TLS
>> +static void nvmet_tcp_tls_queue_reset(struct nvmet_tcp_queue *queue)
>> +{
>> +    spin_lock_irq(&queue->state_lock);
>> +    if (queue->state != NVMET_TCP_Q_TLS_HANDSHAKE) {
>> +        pr_warn("queue %d: TLS handshake already completed\n",
>> +            queue->idx);
>> +        spin_unlock_irq(&queue->state_lock);
>> +        return;
> 
> trigger fatal error here?
> 
Good point.

>> +    }
>> +    queue->state = NVMET_TCP_Q_CONNECTING;
>> +    spin_unlock_irq(&queue->state_lock);
>> +
>> +    pr_debug("queue %d: resetting queue callbacks after TLS 
>> handshake\n",
>> +         queue->idx);
>> +    /*
>> +     * Set callbacks after handshake; TLS implementation
>> +     * might have changed the socket callbacks.
>> +     */
>> +    nvmet_tcp_set_queue_sock(queue);
>> +}
>> +
>> +static void nvmet_tcp_tls_handshake_done(void *data, int status,
>> +                     key_serial_t peerid)
> 
> lets call peerid psk_id throughout.
> 
Okay.

>> +{
>> +    struct nvmet_tcp_queue *queue = data;
>> +
>> +    pr_debug("queue %d: TLS handshake done, key %x, status %d\n",
>> +         queue->idx, peerid, status);
>> +    if (!status) {
>> +        spin_lock_irq(&queue->state_lock);
>> +        queue->tls_pskid = peerid;
>> +        spin_unlock_irq(&queue->state_lock);
>> +    }
>> +    cancel_delayed_work_sync(&queue->tls_handshake_work);
> 
> Hmm, the cancel_delayed_work_sync is scary.
> 
> What happens if it ran and already scheduled a release (which
> already ran and completed)?
> 
Well, we need to stop the timeout at some point; granted, for a failure 
we can just flush the workqueue, but in the success case we need 
terminate the workqueue.
And nvmet_tcp_schedule_release_queue() already checks for the queue 
state, so we should be (reasonably) safe.

>> +    if (status)
>> +        nvmet_tcp_schedule_release_queue(queue);
>> +    else
>> +        nvmet_tcp_tls_queue_reset(queue);
> 
> What I think you need is an atomic state that one or the
> other access, and then you are fine with a normal async
> cancel of the delayed_work.
> 
Hmm. I'll have a look.

>> +}
>> +
>> +static void nvmet_tcp_tls_handshake_timeout_work(struct work_struct *w)
>> +{
>> +    struct nvmet_tcp_queue *queue = container_of(to_delayed_work(w),
>> +            struct nvmet_tcp_queue, tls_handshake_work);
>> +
>> +    pr_debug("queue %d: TLS handshake timeout\n", queue->idx);
>> +    nvmet_tcp_schedule_release_queue(queue);
>> +}
>> +
>> +static int nvmet_tcp_tls_handshake(struct nvmet_tcp_queue *queue)
>> +{
>> +    int ret = -EOPNOTSUPP;
>> +    struct tls_handshake_args args;
>> +
>> +    if (queue->state != NVMET_TCP_Q_TLS_HANDSHAKE) {
>> +        pr_warn("cannot start TLS in state %d\n", queue->state);
>> +        return -EINVAL;
>> +    }
>> +
>> +    pr_debug("queue %d: TLS ServerHello\n", queue->idx);
>> +    memset(&args, 0, sizeof(args));
>> +    args.ta_sock = queue->sock;
>> +    args.ta_done = nvmet_tcp_tls_handshake_done;
>> +    args.ta_data = queue;
>> +    args.ta_keyring = key_serial(queue->port->nport->keyring);
>> +    args.ta_timeout_ms = tls_handshake_timeout * 2 * 1024;
> 
> * 2 * 1024 ? I didn't know we have 2048 ms in a second...
> 
The '_ms' bit is just an indicator of the unit, not the value.
We don't have an 'args.ta_timeout_sec' ...

>> +
>> +    ret = tls_server_hello_psk(&args, GFP_KERNEL);
>> +    if (ret) {
>> +        pr_err("failed to start TLS, err=%d\n", ret);
>> +    } else {
>> +        queue_delayed_work(nvmet_wq, &queue->tls_handshake_work,
>> +                   tls_handshake_timeout * HZ);
>> +    }
>> +    return ret;
>> +}
>> +#endif
>> +
>>   static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>>           struct socket *newsock)
>>   {
>> @@ -1638,7 +1738,11 @@ static void nvmet_tcp_alloc_queue(struct 
>> nvmet_tcp_port *port,
>>       queue->port = port;
>>       queue->nr_cmds = 0;
>>       spin_lock_init(&queue->state_lock);
>> -    queue->state = NVMET_TCP_Q_CONNECTING;
>> +    if (queue->port->nport->disc_addr.tsas.tcp.sectype ==
>> +        NVMF_TCP_SECTYPE_TLS13)
>> +        queue->state = NVMET_TCP_Q_TLS_HANDSHAKE;
>> +    else
>> +        queue->state = NVMET_TCP_Q_CONNECTING;
>>       INIT_LIST_HEAD(&queue->free_list);
>>       init_llist_head(&queue->resp_list);
>>       INIT_LIST_HEAD(&queue->resp_send_list);
>> @@ -1669,6 +1773,25 @@ static void nvmet_tcp_alloc_queue(struct 
>> nvmet_tcp_port *port,
>>       list_add_tail(&queue->queue_list, &nvmet_tcp_queue_list);
>>       mutex_unlock(&nvmet_tcp_queue_mutex);
>> +#ifdef CONFIG_NVME_TARGET_TCP_TLS
> 
> maybe the ifdef can be avoided with stubs?
> 
Lemme check.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 12/14] nvmet-tcp: allocate socket file
  2023-08-07  8:53       ` Sagi Grimberg
@ 2023-08-07  9:17         ` Hannes Reinecke
  2023-08-07 10:42           ` Sagi Grimberg
  0 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-07  9:17 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

On 8/7/23 10:53, Sagi Grimberg wrote:
> 
>>>> -static int nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>>>> +static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>>>>           struct socket *newsock)
>>>
>>> Why does the function change from retcode to void in this patch?
>>
>> Because the return code was never evaluated.
> 
> ??
> 
> I see it is evaluated.

Yes, but this patch moved the 'sock_release()' call into 
nvmet_tcp_alloc_queue(), making the return code obsolete.

But I can make it a separate patch if required.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 12/14] nvmet-tcp: allocate socket file
  2023-08-07  9:17         ` Hannes Reinecke
@ 2023-08-07 10:42           ` Sagi Grimberg
  2023-08-08  6:08             ` Hannes Reinecke
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-07 10:42 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme


>>>>> -static int nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>>>>> +static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>>>>>           struct socket *newsock)
>>>>
>>>> Why does the function change from retcode to void in this patch?
>>>
>>> Because the return code was never evaluated.
>>
>> ??
>>
>> I see it is evaluated.
> 
> Yes, but this patch moved the 'sock_release()' call into 
> nvmet_tcp_alloc_queue(), making the return code obsolete.
> 
> But I can make it a separate patch if required.

But now who calls sock_release if you failed before allocating
the sockfile?



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 13/14] nvmet-tcp: enable TLS handshake upcall
  2023-08-07  9:15     ` Hannes Reinecke
@ 2023-08-07 11:49       ` Sagi Grimberg
  2023-08-08  6:16         ` Hannes Reinecke
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-07 11:49 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme


>>> @@ -389,6 +427,16 @@ static ssize_t nvmet_addr_tsas_store(struct 
>>> config_item *item,
>>>   found:
>>>       nvmet_port_init_tsas_tcp(port, nvmet_addr_tsas_tcp[i].type);
>>> +    if (nvmet_addr_tsas_tcp[i].type == NVMF_TCP_SECTYPE_TLS13) {
>>> +        if (nvmet_port_treq(port) == NVMF_TREQ_NOT_SPECIFIED)
>>> +            treq |= NVMF_TREQ_REQUIRED;
>>> +        else
>>> +            treq |= nvmet_port_treq(port);
>>> +    } else {
>>> +        /* Set to 'not specified' if TLS is not enabled */
>>> +        treq |= NVMF_TREQ_NOT_SPECIFIED;
>>> +    }
>>> +    port->disc_addr.treq = treq;
>>>       return count;
>>>   }
>>
>> Can the treq/tsas be split from the actual nvmet-tcp upcall addition?
>>
> I guess it can; it's just that it doesn't make any sense if we don't 
> have the TLS upcall implemented.

Still its mixing two parts together and makes the patch unnecessarily
complicated.

>>> @@ -1285,12 +1302,12 @@ static int nvmet_tcp_try_recv(struct 
>>> nvmet_tcp_queue *queue,
>>>   static void nvmet_tcp_schedule_release_queue(struct nvmet_tcp_queue 
>>> *queue)
>>>   {
>>> -    spin_lock(&queue->state_lock);
>>> +    spin_lock_irq(&queue->state_lock);
>>
>> Where is this lock taken in irq context that needs disabling irq?
>>
> Let me check; might be that it got solved with the workqueue fixes which 
> went in lately.

Can you explain what was the original issue for this change?

>>> +{
>>> +    struct nvmet_tcp_queue *queue = data;
>>> +
>>> +    pr_debug("queue %d: TLS handshake done, key %x, status %d\n",
>>> +         queue->idx, peerid, status);
>>> +    if (!status) {
>>> +        spin_lock_irq(&queue->state_lock);
>>> +        queue->tls_pskid = peerid;
>>> +        spin_unlock_irq(&queue->state_lock);
>>> +    }
>>> +    cancel_delayed_work_sync(&queue->tls_handshake_work);
>>
>> Hmm, the cancel_delayed_work_sync is scary.
>>
>> What happens if it ran and already scheduled a release (which
>> already ran and completed)?
>>
> Well, we need to stop the timeout at some point; granted, for a failure 
> we can just flush the workqueue, but in the success case we need 
> terminate the workqueue.
> And nvmet_tcp_schedule_release_queue() already checks for the queue 
> state, so we should be (reasonably) safe.

Still it looks error prone (although granted, I didn't spot a particular
race where this breaks).

I wish the handshake code could have given us a timeout functionality
so we don't need to deal with it...

In any event, preferably the synchronization does not require to
synchronize an active work element on the workqueue.

>>> +    pr_debug("queue %d: TLS ServerHello\n", queue->idx);
>>> +    memset(&args, 0, sizeof(args));
>>> +    args.ta_sock = queue->sock;
>>> +    args.ta_done = nvmet_tcp_tls_handshake_done;
>>> +    args.ta_data = queue;
>>> +    args.ta_keyring = key_serial(queue->port->nport->keyring);
>>> +    args.ta_timeout_ms = tls_handshake_timeout * 2 * 1024;
>>
>> * 2 * 1024 ? I didn't know we have 2048 ms in a second...
>>
> The '_ms' bit is just an indicator of the unit, not the value.
> We don't have an 'args.ta_timeout_sec' ...

I expected to see tls_handshake_timeout * 1000, i.e. a trivial
conversion from sec to ms.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 12/14] nvmet-tcp: allocate socket file
  2023-08-07 10:42           ` Sagi Grimberg
@ 2023-08-08  6:08             ` Hannes Reinecke
  2023-08-08  8:44               ` Sagi Grimberg
  0 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-08  6:08 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

On 8/7/23 12:42, Sagi Grimberg wrote:
> 
>>>>>> -static int nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>>>>>> +static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>>>>>>           struct socket *newsock)
>>>>>
>>>>> Why does the function change from retcode to void in this patch?
>>>>
>>>> Because the return code was never evaluated.
>>>
>>> ??
>>>
>>> I see it is evaluated.
>>
>> Yes, but this patch moved the 'sock_release()' call into 
>> nvmet_tcp_alloc_queue(), making the return code obsolete.
>>
>> But I can make it a separate patch if required.
> 
> But now who calls sock_release if you failed before allocating
> the sockfile?
> 
That is done in nvmet_tcp_remove_port(), and independent on
queue release.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 13/14] nvmet-tcp: enable TLS handshake upcall
  2023-08-07 11:49       ` Sagi Grimberg
@ 2023-08-08  6:16         ` Hannes Reinecke
  0 siblings, 0 replies; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-08  6:16 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

On 8/7/23 13:49, Sagi Grimberg wrote:
> 
>>>> @@ -389,6 +427,16 @@ static ssize_t nvmet_addr_tsas_store(struct 
>>>> config_item *item,
>>>>   found:
>>>>       nvmet_port_init_tsas_tcp(port, nvmet_addr_tsas_tcp[i].type);
>>>> +    if (nvmet_addr_tsas_tcp[i].type == NVMF_TCP_SECTYPE_TLS13) {
>>>> +        if (nvmet_port_treq(port) == NVMF_TREQ_NOT_SPECIFIED)
>>>> +            treq |= NVMF_TREQ_REQUIRED;
>>>> +        else
>>>> +            treq |= nvmet_port_treq(port);
>>>> +    } else {
>>>> +        /* Set to 'not specified' if TLS is not enabled */
>>>> +        treq |= NVMF_TREQ_NOT_SPECIFIED;
>>>> +    }
>>>> +    port->disc_addr.treq = treq;
>>>>       return count;
>>>>   }
>>>
>>> Can the treq/tsas be split from the actual nvmet-tcp upcall addition?
>>>
>> I guess it can; it's just that it doesn't make any sense if we don't 
>> have the TLS upcall implemented.
> 
> Still its mixing two parts together and makes the patch unnecessarily
> complicated.
> 
>>>> @@ -1285,12 +1302,12 @@ static int nvmet_tcp_try_recv(struct 
>>>> nvmet_tcp_queue *queue,
>>>>   static void nvmet_tcp_schedule_release_queue(struct 
>>>> nvmet_tcp_queue *queue)
>>>>   {
>>>> -    spin_lock(&queue->state_lock);
>>>> +    spin_lock_irq(&queue->state_lock);
>>>
>>> Where is this lock taken in irq context that needs disabling irq?
>>>
>> Let me check; might be that it got solved with the workqueue fixes 
>> which went in lately.
> 
> Can you explain what was the original issue for this change?
> 
The original issue was a lockdep warning when tearing down the target;
this got addressed by the patch 'nvmet-tcp: add new workqueue to 
suppress lockdep warning' from Guoqing Jiang.

>>>> +{
>>>> +    struct nvmet_tcp_queue *queue = data;
>>>> +
>>>> +    pr_debug("queue %d: TLS handshake done, key %x, status %d\n",
>>>> +         queue->idx, peerid, status);
>>>> +    if (!status) {
>>>> +        spin_lock_irq(&queue->state_lock);
>>>> +        queue->tls_pskid = peerid;
>>>> +        spin_unlock_irq(&queue->state_lock);
>>>> +    }
>>>> +    cancel_delayed_work_sync(&queue->tls_handshake_work);
>>>
>>> Hmm, the cancel_delayed_work_sync is scary.
>>>
>>> What happens if it ran and already scheduled a release (which
>>> already ran and completed)?
>>>
>> Well, we need to stop the timeout at some point; granted, for a 
>> failure we can just flush the workqueue, but in the success case we 
>> need terminate the workqueue.
>> And nvmet_tcp_schedule_release_queue() already checks for the queue 
>> state, so we should be (reasonably) safe.
> 
> Still it looks error prone (although granted, I didn't spot a particular
> race where this breaks).
> 
> I wish the handshake code could have given us a timeout functionality
> so we don't need to deal with it...
> 
> In any event, preferably the synchronization does not require to
> synchronize an active work element on the workqueue.
> 
With the timeout code we always will have two threads; the handshake 
itself and the timeout. So when handshake completes the timeout will
be running, and we will have to terminate it somehow. I really can't
see how we could avoid that.
But I do get your point; will be adding an atomic variable to 
synchronize between these two threads.

>>>> +    pr_debug("queue %d: TLS ServerHello\n", queue->idx);
>>>> +    memset(&args, 0, sizeof(args));
>>>> +    args.ta_sock = queue->sock;
>>>> +    args.ta_done = nvmet_tcp_tls_handshake_done;
>>>> +    args.ta_data = queue;
>>>> +    args.ta_keyring = key_serial(queue->port->nport->keyring);
>>>> +    args.ta_timeout_ms = tls_handshake_timeout * 2 * 1024;
>>>
>>> * 2 * 1024 ? I didn't know we have 2048 ms in a second...
>>>
>> The '_ms' bit is just an indicator of the unit, not the value.
>> We don't have an 'args.ta_timeout_sec' ...
> 
> I expected to see tls_handshake_timeout * 1000, i.e. a trivial
> conversion from sec to ms.

Okay, will do that.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 09/14] nvme-tcp: control message handling for recvmsg()
  2023-08-07  8:22   ` Sagi Grimberg
@ 2023-08-08  6:39     ` Hannes Reinecke
  2023-08-08  8:41       ` Sagi Grimberg
  0 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-08  6:39 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

On 8/7/23 10:22, Sagi Grimberg wrote:
> 
> 
> On 8/3/23 13:50, Hannes Reinecke wrote:
>> kTLS is sending TLS ALERT messages as control messages for recvmsg().
>> As we can't do anything sensible with it just abort the connection
>> and let the userspace agent to a re-negotiation.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
>> ---
>>   drivers/nvme/host/tcp.c | 18 +++++++++++++++++-
>>   1 file changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>> index 98400ad5f969..4d0e3de39c26 100644
>> --- a/drivers/nvme/host/tcp.c
>> +++ b/drivers/nvme/host/tcp.c
>> @@ -14,6 +14,7 @@
>>   #include <net/sock.h>
>>   #include <net/tcp.h>
>>   #include <net/tls.h>
>> +#include <net/tls_prot.h>
>>   #include <net/handshake.h>
>>   #include <linux/blk-mq.h>
>>   #include <crypto/hash.h>
>> @@ -1369,6 +1370,10 @@ static int nvme_tcp_init_connection(struct 
>> nvme_tcp_queue *queue)
>>   {
>>       struct nvme_tcp_icreq_pdu *icreq;
>>       struct nvme_tcp_icresp_pdu *icresp;
>> +#ifdef CONFIG_NVME_TCP_TLS
>> +    char cbuf[CMSG_LEN(sizeof(char))] = {};
>> +    u8 ctype;
>> +#endif
>>       struct msghdr msg = {};
>>       struct kvec iov;
>>       bool ctrl_hdgst, ctrl_ddgst;
>> @@ -1406,11 +1411,22 @@ static int nvme_tcp_init_connection(struct 
>> nvme_tcp_queue *queue)
>>       memset(&msg, 0, sizeof(msg));
>>       iov.iov_base = icresp;
>>       iov.iov_len = sizeof(*icresp);
>> +#ifdef CONFIG_NVME_TCP_TLS
>> +    msg.msg_control = cbuf;
>> +    msg.msg_controllen = sizeof(cbuf);
>> +#endif
>>       ret = kernel_recvmsg(queue->sock, &msg, &iov, 1,
>>               iov.iov_len, msg.msg_flags);
>>       if (ret < 0)
>>           goto free_icresp;
> 
> Is the cmsg passed unconditionally? meaning this does not
> impact non-tls?
> 
> If so, why is the ifdef needed?
> 
Passing in a cmsg (and a controllen) always enables the use of
control messages, irrespective of the type.
As we never enabled control messages prior to this we would
change the implementation when TLS is not enabled; who knows,
someone might have been sending us control messages all along,
but we never checked. So I thought it prudent to only enable
it for TLS where we know that we might be getting control messages.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 09/14] nvme-tcp: control message handling for recvmsg()
  2023-08-08  6:39     ` Hannes Reinecke
@ 2023-08-08  8:41       ` Sagi Grimberg
  2023-08-08  8:51         ` Hannes Reinecke
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-08  8:41 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme



On 8/8/23 09:39, Hannes Reinecke wrote:
> On 8/7/23 10:22, Sagi Grimberg wrote:
>>
>>
>> On 8/3/23 13:50, Hannes Reinecke wrote:
>>> kTLS is sending TLS ALERT messages as control messages for recvmsg().
>>> As we can't do anything sensible with it just abort the connection
>>> and let the userspace agent to a re-negotiation.
>>>
>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
>>> ---
>>>   drivers/nvme/host/tcp.c | 18 +++++++++++++++++-
>>>   1 file changed, 17 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>>> index 98400ad5f969..4d0e3de39c26 100644
>>> --- a/drivers/nvme/host/tcp.c
>>> +++ b/drivers/nvme/host/tcp.c
>>> @@ -14,6 +14,7 @@
>>>   #include <net/sock.h>
>>>   #include <net/tcp.h>
>>>   #include <net/tls.h>
>>> +#include <net/tls_prot.h>
>>>   #include <net/handshake.h>
>>>   #include <linux/blk-mq.h>
>>>   #include <crypto/hash.h>
>>> @@ -1369,6 +1370,10 @@ static int nvme_tcp_init_connection(struct 
>>> nvme_tcp_queue *queue)
>>>   {
>>>       struct nvme_tcp_icreq_pdu *icreq;
>>>       struct nvme_tcp_icresp_pdu *icresp;
>>> +#ifdef CONFIG_NVME_TCP_TLS
>>> +    char cbuf[CMSG_LEN(sizeof(char))] = {};
>>> +    u8 ctype;
>>> +#endif
>>>       struct msghdr msg = {};
>>>       struct kvec iov;
>>>       bool ctrl_hdgst, ctrl_ddgst;
>>> @@ -1406,11 +1411,22 @@ static int nvme_tcp_init_connection(struct 
>>> nvme_tcp_queue *queue)
>>>       memset(&msg, 0, sizeof(msg));
>>>       iov.iov_base = icresp;
>>>       iov.iov_len = sizeof(*icresp);
>>> +#ifdef CONFIG_NVME_TCP_TLS
>>> +    msg.msg_control = cbuf;
>>> +    msg.msg_controllen = sizeof(cbuf);
>>> +#endif
>>>       ret = kernel_recvmsg(queue->sock, &msg, &iov, 1,
>>>               iov.iov_len, msg.msg_flags);
>>>       if (ret < 0)
>>>           goto free_icresp;
>>
>> Is the cmsg passed unconditionally? meaning this does not
>> impact non-tls?
>>
>> If so, why is the ifdef needed?
>>
> Passing in a cmsg (and a controllen) always enables the use of
> control messages, irrespective of the type.
> As we never enabled control messages prior to this we would
> change the implementation when TLS is not enabled; who knows,
> someone might have been sending us control messages all along,
> but we never checked. So I thought it prudent to only enable
> it for TLS where we know that we might be getting control messages.

But you are doing this only based on the compile-time config option, not
based on a runtime setting. I am not sure I follow your logic.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 12/14] nvmet-tcp: allocate socket file
  2023-08-08  6:08             ` Hannes Reinecke
@ 2023-08-08  8:44               ` Sagi Grimberg
  0 siblings, 0 replies; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-08  8:44 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme


>>>>>>> -static int nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>>>>>>> +static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>>>>>>>           struct socket *newsock)
>>>>>>
>>>>>> Why does the function change from retcode to void in this patch?
>>>>>
>>>>> Because the return code was never evaluated.
>>>>
>>>> ??
>>>>
>>>> I see it is evaluated.
>>>
>>> Yes, but this patch moved the 'sock_release()' call into 
>>> nvmet_tcp_alloc_queue(), making the return code obsolete.
>>>
>>> But I can make it a separate patch if required.
>>
>> But now who calls sock_release if you failed before allocating
>> the sockfile?
>>
> That is done in nvmet_tcp_remove_port(), and independent on
> queue release.

I'm referring to newsock, not the listener socket.

--
static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
                 struct socket *newsock)
{
         struct nvmet_tcp_queue *queue;
         int ret;

         queue = kzalloc(sizeof(*queue), GFP_KERNEL);
	//i.e. you fail here...
         if (!queue)
                 return;
--


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 09/14] nvme-tcp: control message handling for recvmsg()
  2023-08-08  8:41       ` Sagi Grimberg
@ 2023-08-08  8:51         ` Hannes Reinecke
  2023-08-08  9:05           ` Sagi Grimberg
  0 siblings, 1 reply; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-08  8:51 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

On 8/8/23 10:41, Sagi Grimberg wrote:
> 
> 
> On 8/8/23 09:39, Hannes Reinecke wrote:
>> On 8/7/23 10:22, Sagi Grimberg wrote:
>>>
>>>
>>> On 8/3/23 13:50, Hannes Reinecke wrote:
>>>> kTLS is sending TLS ALERT messages as control messages for recvmsg().
>>>> As we can't do anything sensible with it just abort the connection
>>>> and let the userspace agent to a re-negotiation.
>>>>
>>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>>> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
>>>> ---
>>>>   drivers/nvme/host/tcp.c | 18 +++++++++++++++++-
>>>>   1 file changed, 17 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>>>> index 98400ad5f969..4d0e3de39c26 100644
>>>> --- a/drivers/nvme/host/tcp.c
>>>> +++ b/drivers/nvme/host/tcp.c
>>>> @@ -14,6 +14,7 @@
>>>>   #include <net/sock.h>
>>>>   #include <net/tcp.h>
>>>>   #include <net/tls.h>
>>>> +#include <net/tls_prot.h>
>>>>   #include <net/handshake.h>
>>>>   #include <linux/blk-mq.h>
>>>>   #include <crypto/hash.h>
>>>> @@ -1369,6 +1370,10 @@ static int nvme_tcp_init_connection(struct 
>>>> nvme_tcp_queue *queue)
>>>>   {
>>>>       struct nvme_tcp_icreq_pdu *icreq;
>>>>       struct nvme_tcp_icresp_pdu *icresp;
>>>> +#ifdef CONFIG_NVME_TCP_TLS
>>>> +    char cbuf[CMSG_LEN(sizeof(char))] = {};
>>>> +    u8 ctype;
>>>> +#endif
>>>>       struct msghdr msg = {};
>>>>       struct kvec iov;
>>>>       bool ctrl_hdgst, ctrl_ddgst;
>>>> @@ -1406,11 +1411,22 @@ static int nvme_tcp_init_connection(struct 
>>>> nvme_tcp_queue *queue)
>>>>       memset(&msg, 0, sizeof(msg));
>>>>       iov.iov_base = icresp;
>>>>       iov.iov_len = sizeof(*icresp);
>>>> +#ifdef CONFIG_NVME_TCP_TLS
>>>> +    msg.msg_control = cbuf;
>>>> +    msg.msg_controllen = sizeof(cbuf);
>>>> +#endif
>>>>       ret = kernel_recvmsg(queue->sock, &msg, &iov, 1,
>>>>               iov.iov_len, msg.msg_flags);
>>>>       if (ret < 0)
>>>>           goto free_icresp;
>>>
>>> Is the cmsg passed unconditionally? meaning this does not
>>> impact non-tls?
>>>
>>> If so, why is the ifdef needed?
>>>
>> Passing in a cmsg (and a controllen) always enables the use of
>> control messages, irrespective of the type.
>> As we never enabled control messages prior to this we would
>> change the implementation when TLS is not enabled; who knows,
>> someone might have been sending us control messages all along,
>> but we never checked. So I thought it prudent to only enable
>> it for TLS where we know that we might be getting control messages.
> 
> But you are doing this only based on the compile-time config option, not
> based on a runtime setting. I am not sure I follow your logic.

For TLS I actually check the cmsg type, so even if we were to receive 
other control messages they would be caught.
If I were to enable control messages in general I would need to check 
for control messages on every packet, which may well impact performance.
Do we want that?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Frankenstr. 146, 90461 Nürnberg
Managing Directors: I. Totev, A. Myers, A. McDonald, M. B. Moerman
(HRB 36809, AG Nürnberg)



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 09/14] nvme-tcp: control message handling for recvmsg()
  2023-08-08  8:51         ` Hannes Reinecke
@ 2023-08-08  9:05           ` Sagi Grimberg
  2023-08-08 10:57             ` Pawel Baldysiak
       [not found]             ` <20230808105403.3949653-1-pawel.baldysiak@dell.com>
  0 siblings, 2 replies; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-08  9:05 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme


>>>> Is the cmsg passed unconditionally? meaning this does not
>>>> impact non-tls?
>>>>
>>>> If so, why is the ifdef needed?
>>>>
>>> Passing in a cmsg (and a controllen) always enables the use of
>>> control messages, irrespective of the type.
>>> As we never enabled control messages prior to this we would
>>> change the implementation when TLS is not enabled; who knows,
>>> someone might have been sending us control messages all along,
>>> but we never checked. So I thought it prudent to only enable
>>> it for TLS where we know that we might be getting control messages.
>>
>> But you are doing this only based on the compile-time config option, not
>> based on a runtime setting. I am not sure I follow your logic.
> 
> For TLS I actually check the cmsg type, so even if we were to receive 
> other control messages they would be caught.
> If I were to enable control messages in general I would need to check 
> for control messages on every packet, which may well impact performance.
> Do we want that?

Hannes, still not following.

Lets assume that CONFIG_NVME_TCP_TLS=y, like it will usually be
by default for most server systems on the planet (where nvmet is
expected to run).

Please explain how you do not affect non-tls workloads? Or what is
the difference? I am not following your logic here.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 09/14] nvme-tcp: control message handling for recvmsg()
  2023-08-08  9:05           ` Sagi Grimberg
@ 2023-08-08 10:57             ` Pawel Baldysiak
       [not found]             ` <20230808105403.3949653-1-pawel.baldysiak@dell.com>
  1 sibling, 0 replies; 45+ messages in thread
From: Pawel Baldysiak @ 2023-08-08 10:57 UTC (permalink / raw)
  To: sagi; +Cc: hare, hch, kbusch, linux-nvme

Hi all,

I'm testing this patchset and observe the issue with above.
It's not only about passing cbuf, but the check below.
When I would like to connect/discover without TLS, 
I got an error about unhandled TLS record.
tls_get_record_type() simply returns 0 for non-TLS,
so it breaks non-tls connections here.

Thanks
Pawel Baldysiak


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 09/14] nvme-tcp: control message handling for recvmsg()
       [not found]             ` <20230808105403.3949653-1-pawel.baldysiak@dell.com>
@ 2023-08-08 11:45               ` Sagi Grimberg
  2023-08-08 11:56                 ` Hannes Reinecke
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2023-08-08 11:45 UTC (permalink / raw)
  To: Pawel Baldysiak, Hannes Reinecke; +Cc: hare, hch, kbusch, linux-nvme


> Hi all,
> 
> I'm testing this patchset and observe the issue with above.
> It's not only about passing cbuf, but the check below.
> When I would like to connect/discover without TLS,
> I got an error about unhandled TLS record.
> tls_get_record_type() simply returns 0 for non-TLS,
> so it breaks non-tls connections here.

Hannes?

probably need to make sure to test without tls (sweep
blktests should be enough).


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 09/14] nvme-tcp: control message handling for recvmsg()
  2023-08-08 11:45               ` Sagi Grimberg
@ 2023-08-08 11:56                 ` Hannes Reinecke
  0 siblings, 0 replies; 45+ messages in thread
From: Hannes Reinecke @ 2023-08-08 11:56 UTC (permalink / raw)
  To: Sagi Grimberg, Pawel Baldysiak; +Cc: hch, kbusch, linux-nvme

On 8/8/23 13:45, Sagi Grimberg wrote:
> 
>> Hi all,
>>
>> I'm testing this patchset and observe the issue with above.
>> It's not only about passing cbuf, but the check below.
>> When I would like to connect/discover without TLS,
>> I got an error about unhandled TLS record.
>> tls_get_record_type() simply returns 0 for non-TLS,
>> so it breaks non-tls connections here.
> 
> Hannes?
> 
> probably need to make sure to test without tls (sweep
> blktests should be enough).

Yeah, will do.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Frankenstr. 146, 90461 Nürnberg
Managing Directors: I. Totev, A. Myers, A. McDonald, M. B. Moerman
(HRB 36809, AG Nürnberg)



^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2023-08-08 11:56 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-03 10:50 [PATCHv5 00/14] nvme: In-kernel TLS support for TCP Hannes Reinecke
2023-08-03 10:50 ` [PATCH 01/14] nvme-keyring: register '.nvme' keyring Hannes Reinecke
2023-08-07  7:09   ` Sagi Grimberg
2023-08-03 10:50 ` [PATCH 02/14] nvme-keyring: define a 'psk' keytype Hannes Reinecke
2023-08-07  7:11   ` Sagi Grimberg
2023-08-03 10:50 ` [PATCH 03/14] nvme: add TCP TSAS definitions Hannes Reinecke
2023-08-03 10:50 ` [PATCH 04/14] nvme-tcp: add definitions for TLS cipher suites Hannes Reinecke
2023-08-03 10:50 ` [PATCH 05/14] nvme-keyring: implement nvme_tls_psk_default() Hannes Reinecke
2023-08-07  7:13   ` Sagi Grimberg
2023-08-03 10:50 ` [PATCH 06/14] security/keys: export key_lookup() Hannes Reinecke
2023-08-07  7:13   ` Sagi Grimberg
2023-08-03 10:50 ` [PATCH 07/14] nvme/tcp: allocate socket file Hannes Reinecke
2023-08-07  7:15   ` Sagi Grimberg
2023-08-07  7:23     ` Hannes Reinecke
2023-08-03 10:50 ` [PATCH 08/14] nvme-tcp: enable TLS handshake upcall Hannes Reinecke
2023-08-07  8:20   ` Sagi Grimberg
2023-08-07  8:32     ` Hannes Reinecke
2023-08-03 10:50 ` [PATCH 09/14] nvme-tcp: control message handling for recvmsg() Hannes Reinecke
2023-08-07  8:22   ` Sagi Grimberg
2023-08-08  6:39     ` Hannes Reinecke
2023-08-08  8:41       ` Sagi Grimberg
2023-08-08  8:51         ` Hannes Reinecke
2023-08-08  9:05           ` Sagi Grimberg
2023-08-08 10:57             ` Pawel Baldysiak
     [not found]             ` <20230808105403.3949653-1-pawel.baldysiak@dell.com>
2023-08-08 11:45               ` Sagi Grimberg
2023-08-08 11:56                 ` Hannes Reinecke
2023-08-03 10:50 ` [PATCH 10/14] nvme-fabrics: parse options 'keyring' and 'tls_key' Hannes Reinecke
2023-08-07  8:23   ` Sagi Grimberg
2023-08-07  8:34     ` Hannes Reinecke
2023-08-03 10:50 ` [PATCH 11/14] nvmet: make TCP sectype settable via configfs Hannes Reinecke
2023-08-07  8:25   ` Sagi Grimberg
2023-08-03 10:51 ` [PATCH 12/14] nvmet-tcp: allocate socket file Hannes Reinecke
2023-08-07  8:27   ` Sagi Grimberg
2023-08-07  8:49     ` Hannes Reinecke
2023-08-07  8:53       ` Sagi Grimberg
2023-08-07  9:17         ` Hannes Reinecke
2023-08-07 10:42           ` Sagi Grimberg
2023-08-08  6:08             ` Hannes Reinecke
2023-08-08  8:44               ` Sagi Grimberg
2023-08-03 10:51 ` [PATCH 13/14] nvmet-tcp: enable TLS handshake upcall Hannes Reinecke
2023-08-07  8:51   ` Sagi Grimberg
2023-08-07  9:15     ` Hannes Reinecke
2023-08-07 11:49       ` Sagi Grimberg
2023-08-08  6:16         ` Hannes Reinecke
2023-08-03 10:51 ` [PATCH 14/14] nvmet-tcp: control messages for recvmsg() Hannes Reinecke

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.