All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v8 0/4] Another crack at a handshake upcall mechanism
@ 2023-04-03 18:45 Chuck Lever
  2023-04-03 18:46 ` [PATCH v8 1/4] net/handshake: Create a NETLINK service for handling handshake requests Chuck Lever
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Chuck Lever @ 2023-04-03 18:45 UTC (permalink / raw)
  To: kuba, pabeni, edumazet, borisp; +Cc: netdev, kernel-tls-handshake, john.haxby

Hi-

Here is v8 of a series to add generic support for transport layer
security handshake on behalf of kernel socket consumers (user space
consumers use a security library directly, of course). A summary of
the purpose of these patches is archived here:

https://lore.kernel.org/netdev/1DE06BB1-6BA9-4DB4-B2AA-07DE532963D6@oracle.com/

The full patch set to support SunRPC with TLSv1.3 is available in
the topic-rpc-with-tls-upcall branch here, based on net-next/main:

https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git

This patch set includes support for in-transit confidentiality and
peer authentication for both the Linux NFS client and server.

A user space handshake agent for TLSv1.3 to go along with the kernel
patches is available in the "main" branch here:

https://github.com/oracle/ktls-utils

---

Major changes since v7:
- Addressed Paolo's v7 review comments
- Added initial set of Kunit tests for the handshake API
- Included an NFS server patch to add new TLS_RECORD_TYPE values

Major changes since v6:
- YAML spec and generated artifacts are now under dual license
- Addressed Jakub's v6 review comments
- Implemented a memory-sensitive limit on the number of pending
  handshake requests
- Implemented upcall support for multiple peer identities

Major changes since v5:
- Added a "timeout" attribute to the handshake netlink protocol
- Removed the GnuTLS-specific "priorities" attribute
- Added support for keyrings to restrict access to keys
- Simplified the kernel consumer TLS handshake API
- The handshake netlink protocol can handle multiple peer IDs or
  certificates in the ACCEPT and DONE operations, though the
  implementation does not yet support it.

Major changes since v4:
- Rebased onto net-next/main
- Replaced req reference counting with ->sk_destruct
- CMD_ACCEPT now does the equivalent of a dup(2) rather than an
  accept(2)
- CMD_DONE no longer closes the user space socket endpoint
- handshake_req_cancel is now tested and working
- Added a YAML specification for the netlink upcall protocol, and
  simplified the protocol to fit the YAML schema
- Added an initial set of tracepoints

Changes since v3:
- Converted all netlink code to use Generic Netlink
- Reworked handshake request lifetime logic throughout
- Global pending list is now per-net
- On completion, return the remote's identity to the consumer

Changes since v2:
- PF_HANDSHAKE replaced with NETLINK_HANDSHAKE
- Replaced listen(2) / poll(2) with a multicast notification service
- Replaced accept(2) with a netlink operation that can return an
  open fd and handshake parameters
- Replaced close(2) with a netlink operation that can take arguments

Changes since RFC:
- Generic upcall support split away from kTLS
- Added support for TLS ServerHello
- Documentation has been temporarily removed while API churns

---

Chuck Lever (4):
      net/handshake: Create a NETLINK service for handling handshake requests
      net/handshake: Add a kernel API for requesting a TLSv1.3 handshake
      net/handshake: Add Kunit tests for the handshake consumer API
      SUNRPC: Recognize control messages in server-side TCP socket code


 Documentation/netlink/specs/handshake.yaml | 124 +++++
 Documentation/networking/index.rst         |   1 +
 Documentation/networking/tls-handshake.rst | 217 +++++++++
 MAINTAINERS                                |  11 +
 include/net/handshake.h                    |  43 ++
 include/net/tls.h                          |   2 +
 include/trace/events/handshake.h           | 159 +++++++
 include/trace/events/sunrpc.h              |  39 ++
 include/uapi/linux/handshake.h             |  73 +++
 net/Kconfig                                |  20 +
 net/Makefile                               |   1 +
 net/handshake/Makefile                     |  13 +
 net/handshake/genl.c                       |  58 +++
 net/handshake/genl.h                       |  24 +
 net/handshake/handshake.h                  |  81 ++++
 net/handshake/kunit-test.c                 | 523 +++++++++++++++++++++
 net/handshake/netlink.c                    | 327 +++++++++++++
 net/handshake/request.c                    | 344 ++++++++++++++
 net/handshake/tlshd.c                      | 417 ++++++++++++++++
 net/handshake/trace.c                      |  20 +
 net/sunrpc/svcsock.c                       |  49 +-
 21 files changed, 2544 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/netlink/specs/handshake.yaml
 create mode 100644 Documentation/networking/tls-handshake.rst
 create mode 100644 include/net/handshake.h
 create mode 100644 include/trace/events/handshake.h
 create mode 100644 include/uapi/linux/handshake.h
 create mode 100644 net/handshake/Makefile
 create mode 100644 net/handshake/genl.c
 create mode 100644 net/handshake/genl.h
 create mode 100644 net/handshake/handshake.h
 create mode 100644 net/handshake/kunit-test.c
 create mode 100644 net/handshake/netlink.c
 create mode 100644 net/handshake/request.c
 create mode 100644 net/handshake/tlshd.c
 create mode 100644 net/handshake/trace.c

--
Chuck Lever


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v8 1/4] net/handshake: Create a NETLINK service for handling handshake requests
  2023-04-03 18:45 [PATCH v8 0/4] Another crack at a handshake upcall mechanism Chuck Lever
@ 2023-04-03 18:46 ` Chuck Lever
  2023-04-04 15:36   ` Chuck Lever III
  2023-04-05  0:23   ` Jakub Kicinski
  2023-04-03 18:46 ` [PATCH v8 2/4] net/handshake: Add a kernel API for requesting a TLSv1.3 handshake Chuck Lever
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 16+ messages in thread
From: Chuck Lever @ 2023-04-03 18:46 UTC (permalink / raw)
  To: kuba, pabeni, edumazet, borisp; +Cc: netdev, kernel-tls-handshake, john.haxby

From: Chuck Lever <chuck.lever@oracle.com>

When a kernel consumer needs a transport layer security session, it
first needs a handshake to negotiate and establish a session. This
negotiation can be done in user space via one of the several
existing library implementations, or it can be done in the kernel.

No in-kernel handshake implementations yet exist. In their absence,
we add a netlink service that can:

a. Notify a user space daemon that a handshake is needed.

b. Once notified, the daemon calls the kernel back via this
   netlink service to get the handshake parameters, including an
   open socket on which to establish the session.

c. Once the handshake is complete, the daemon reports the
   session status and other information via a second netlink
   operation. This operation marks that it is safe for the
   kernel to use the open socket and the security session
   established there.

The notification service uses a multicast group. Each handshake
mechanism (eg, tlshd) adopts its own group number so that the
handshake services are completely independent of one another. The
kernel can then tell via netlink_has_listeners() whether a handshake
service is active and prepared to handle a handshake request.

A new netlink operation, ACCEPT, acts like accept(2) in that it
instantiates a file descriptor in the user space daemon's fd table.
If this operation is successful, the reply carries the fd number,
which can be treated as an open and ready file descriptor.

While user space is performing the handshake, the kernel keeps its
muddy paws off the open socket. A second new netlink operation,
DONE, indicates that the user space daemon is finished with the
socket and it is safe for the kernel to use again. The operation
also indicates whether a session was established successfully.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 Documentation/netlink/specs/handshake.yaml |  122 ++++++++++
 MAINTAINERS                                |    9 +
 include/trace/events/handshake.h           |  159 +++++++++++++
 include/uapi/linux/handshake.h             |   71 ++++++
 net/Kconfig                                |    5 
 net/Makefile                               |    1 
 net/handshake/Makefile                     |   11 +
 net/handshake/genl.c                       |   57 +++++
 net/handshake/genl.h                       |   23 ++
 net/handshake/handshake.h                  |   81 +++++++
 net/handshake/netlink.c                    |  308 +++++++++++++++++++++++++
 net/handshake/request.c                    |  344 ++++++++++++++++++++++++++++
 net/handshake/trace.c                      |   20 ++
 13 files changed, 1211 insertions(+)
 create mode 100644 Documentation/netlink/specs/handshake.yaml
 create mode 100644 include/trace/events/handshake.h
 create mode 100644 include/uapi/linux/handshake.h
 create mode 100644 net/handshake/Makefile
 create mode 100644 net/handshake/genl.c
 create mode 100644 net/handshake/genl.h
 create mode 100644 net/handshake/handshake.h
 create mode 100644 net/handshake/netlink.c
 create mode 100644 net/handshake/request.c
 create mode 100644 net/handshake/trace.c

diff --git a/Documentation/netlink/specs/handshake.yaml b/Documentation/netlink/specs/handshake.yaml
new file mode 100644
index 000000000000..0333d92b1438
--- /dev/null
+++ b/Documentation/netlink/specs/handshake.yaml
@@ -0,0 +1,122 @@
+# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)
+#
+# Author: Chuck Lever <chuck.lever@oracle.com>
+#
+# Copyright (c) 2023, Oracle and/or its affiliates.
+#
+
+name: handshake
+
+protocol: genetlink
+
+doc: Netlink protocol to request a transport layer security handshake.
+
+definitions:
+  -
+    type: enum
+    name: handler-class
+    value-start: 0
+    entries: [ none, max ]
+  -
+    type: enum
+    name: msg-type
+    value-start: 0
+    entries: [ unspec, clienthello, serverhello ]
+  -
+    type: enum
+    name: auth
+    value-start: 0
+    entries: [ unspec, unauth, psk, x509 ]
+
+attribute-sets:
+  -
+    name: x509
+    attributes:
+      -
+        name: cert
+        type: u32
+      -
+        name: privkey
+        type: u32
+  -
+    name: accept
+    attributes:
+      -
+        name: sockfd
+        type: u32
+      -
+        name: handler-class
+        type: u32
+        enum: handler-class
+      -
+        name: message-type
+        type: u32
+        enum: msg-type
+      -
+        name: timeout
+        type: u32
+      -
+        name: auth-mode
+        type: u32
+        enum: auth
+      -
+        name: peer-identity
+        type: u32
+        multi-attr: true
+      -
+        name: certificate
+        type: nest
+        nested-attributes: x509
+        multi-attr: true
+  -
+    name: done
+    attributes:
+      -
+        name: status
+        type: u32
+      -
+        name: sockfd
+        type: u32
+      -
+        name: remote-auth
+        type: u32
+        multi-attr: true
+
+operations:
+  list:
+    -
+      name: ready
+      doc: Notify handlers that a new handshake request is waiting
+      notify: accept
+    -
+      name: accept
+      doc: Handler retrieves next queued handshake request
+      attribute-set: accept
+      flags: [ admin-perm ]
+      do:
+        request:
+          attributes:
+            - handler-class
+        reply:
+          attributes:
+            - sockfd
+            - message-type
+            - timeout
+            - auth-mode
+            - peer-identity
+            - certificate
+    -
+      name: done
+      doc: Handler reports handshake completion
+      attribute-set: done
+      do:
+        request:
+          attributes:
+            - status
+            - sockfd
+            - remote-auth
+
+mcast-groups:
+  list:
+    -
+      name: none
diff --git a/MAINTAINERS b/MAINTAINERS
index 7812f0e251ad..d78d67ac7f89 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8940,6 +8940,15 @@ Q:	http://patchwork.linuxtv.org/project/linux-media/list/
 T:	git git://linuxtv.org/anttip/media_tree.git
 F:	drivers/media/usb/hackrf/
 
+HANDSHAKE UPCALL FOR TRANSPORT LAYER SECURITY
+M:	Chuck Lever <chuck.lever@oracle.com>
+L:	kernel-tls-handshake@lists.linux.dev
+L:	netdev@vger.kernel.org
+S:	Maintained
+F:	Documentation/netlink/specs/handshake.yaml
+F:	include/trace/events/handshake.h
+F:	net/handshake/
+
 HANTRO VPU CODEC DRIVER
 M:	Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
 M:	Philipp Zabel <p.zabel@pengutronix.de>
diff --git a/include/trace/events/handshake.h b/include/trace/events/handshake.h
new file mode 100644
index 000000000000..97f0f160393b
--- /dev/null
+++ b/include/trace/events/handshake.h
@@ -0,0 +1,159 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM handshake
+
+#if !defined(_TRACE_HANDSHAKE_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_HANDSHAKE_H
+
+#include <linux/net.h>
+#include <linux/tracepoint.h>
+
+DECLARE_EVENT_CLASS(handshake_event_class,
+	TP_PROTO(
+		const struct net *net,
+		const struct handshake_req *req,
+		const struct sock *sk
+	),
+	TP_ARGS(net, req, sk),
+	TP_STRUCT__entry(
+		__field(const void *, req)
+		__field(const void *, sk)
+		__field(unsigned int, netns_ino)
+	),
+	TP_fast_assign(
+		__entry->req = req;
+		__entry->sk = sk;
+		__entry->netns_ino = net->ns.inum;
+	),
+	TP_printk("req=%p sk=%p",
+		__entry->req, __entry->sk
+	)
+);
+#define DEFINE_HANDSHAKE_EVENT(name)				\
+	DEFINE_EVENT(handshake_event_class, name,		\
+		TP_PROTO(					\
+			const struct net *net,			\
+			const struct handshake_req *req,	\
+			const struct sock *sk			\
+		),						\
+		TP_ARGS(net, req, sk))
+
+DECLARE_EVENT_CLASS(handshake_fd_class,
+	TP_PROTO(
+		const struct net *net,
+		const struct handshake_req *req,
+		const struct sock *sk,
+		int fd
+	),
+	TP_ARGS(net, req, sk, fd),
+	TP_STRUCT__entry(
+		__field(const void *, req)
+		__field(const void *, sk)
+		__field(int, fd)
+		__field(unsigned int, netns_ino)
+	),
+	TP_fast_assign(
+		__entry->req = req;
+		__entry->sk = req->hr_sk;
+		__entry->fd = fd;
+		__entry->netns_ino = net->ns.inum;
+	),
+	TP_printk("req=%p sk=%p fd=%d",
+		__entry->req, __entry->sk, __entry->fd
+	)
+);
+#define DEFINE_HANDSHAKE_FD_EVENT(name)				\
+	DEFINE_EVENT(handshake_fd_class, name,			\
+		TP_PROTO(					\
+			const struct net *net,			\
+			const struct handshake_req *req,	\
+			const struct sock *sk,			\
+			int fd					\
+		),						\
+		TP_ARGS(net, req, sk, fd))
+
+DECLARE_EVENT_CLASS(handshake_error_class,
+	TP_PROTO(
+		const struct net *net,
+		const struct handshake_req *req,
+		const struct sock *sk,
+		int err
+	),
+	TP_ARGS(net, req, sk, err),
+	TP_STRUCT__entry(
+		__field(const void *, req)
+		__field(const void *, sk)
+		__field(int, err)
+		__field(unsigned int, netns_ino)
+	),
+	TP_fast_assign(
+		__entry->req = req;
+		__entry->sk = sk;
+		__entry->err = err;
+		__entry->netns_ino = net->ns.inum;
+	),
+	TP_printk("req=%p sk=%p err=%d",
+		__entry->req, __entry->sk, __entry->err
+	)
+);
+#define DEFINE_HANDSHAKE_ERROR(name)				\
+	DEFINE_EVENT(handshake_error_class, name,		\
+		TP_PROTO(					\
+			const struct net *net,			\
+			const struct handshake_req *req,	\
+			const struct sock *sk,			\
+			int err					\
+		),						\
+		TP_ARGS(net, req, sk, err))
+
+
+/**
+ ** Request lifetime events
+ **/
+
+DEFINE_HANDSHAKE_EVENT(handshake_submit);
+DEFINE_HANDSHAKE_ERROR(handshake_submit_err);
+DEFINE_HANDSHAKE_EVENT(handshake_cancel);
+DEFINE_HANDSHAKE_EVENT(handshake_cancel_none);
+DEFINE_HANDSHAKE_EVENT(handshake_cancel_busy);
+DEFINE_HANDSHAKE_EVENT(handshake_destruct);
+
+
+TRACE_EVENT(handshake_complete,
+	TP_PROTO(
+		const struct net *net,
+		const struct handshake_req *req,
+		const struct sock *sk,
+		int status
+	),
+	TP_ARGS(net, req, sk, status),
+	TP_STRUCT__entry(
+		__field(const void *, req)
+		__field(const void *, sk)
+		__field(int, status)
+		__field(unsigned int, netns_ino)
+	),
+	TP_fast_assign(
+		__entry->req = req;
+		__entry->sk = sk;
+		__entry->status = status;
+		__entry->netns_ino = net->ns.inum;
+	),
+	TP_printk("req=%p sk=%p status=%d",
+		__entry->req, __entry->sk, __entry->status
+	)
+);
+
+/**
+ ** Netlink events
+ **/
+
+DEFINE_HANDSHAKE_ERROR(handshake_notify_err);
+DEFINE_HANDSHAKE_FD_EVENT(handshake_cmd_accept);
+DEFINE_HANDSHAKE_ERROR(handshake_cmd_accept_err);
+DEFINE_HANDSHAKE_FD_EVENT(handshake_cmd_done);
+DEFINE_HANDSHAKE_ERROR(handshake_cmd_done_err);
+
+#endif /* _TRACE_HANDSHAKE_H */
+
+#include <trace/define_trace.h>
diff --git a/include/uapi/linux/handshake.h b/include/uapi/linux/handshake.h
new file mode 100644
index 000000000000..7f66ff489b87
--- /dev/null
+++ b/include/uapi/linux/handshake.h
@@ -0,0 +1,71 @@
+/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
+/* Do not edit directly, auto-generated from: */
+/*	Documentation/netlink/specs/handshake.yaml */
+/* YNL-GEN uapi header */
+
+#ifndef _UAPI_LINUX_HANDSHAKE_H
+#define _UAPI_LINUX_HANDSHAKE_H
+
+#define HANDSHAKE_FAMILY_NAME		"handshake"
+#define HANDSHAKE_FAMILY_VERSION	1
+
+enum handshake_handler_class {
+	HANDSHAKE_HANDLER_CLASS_NONE,
+	HANDSHAKE_HANDLER_CLASS_MAX,
+};
+
+enum handshake_msg_type {
+	HANDSHAKE_MSG_TYPE_UNSPEC,
+	HANDSHAKE_MSG_TYPE_CLIENTHELLO,
+	HANDSHAKE_MSG_TYPE_SERVERHELLO,
+};
+
+enum handshake_auth {
+	HANDSHAKE_AUTH_UNSPEC,
+	HANDSHAKE_AUTH_UNAUTH,
+	HANDSHAKE_AUTH_PSK,
+	HANDSHAKE_AUTH_X509,
+};
+
+enum {
+	HANDSHAKE_A_X509_CERT = 1,
+	HANDSHAKE_A_X509_PRIVKEY,
+
+	__HANDSHAKE_A_X509_MAX,
+	HANDSHAKE_A_X509_MAX = (__HANDSHAKE_A_X509_MAX - 1)
+};
+
+enum {
+	HANDSHAKE_A_ACCEPT_SOCKFD = 1,
+	HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
+	HANDSHAKE_A_ACCEPT_MESSAGE_TYPE,
+	HANDSHAKE_A_ACCEPT_TIMEOUT,
+	HANDSHAKE_A_ACCEPT_AUTH_MODE,
+	HANDSHAKE_A_ACCEPT_PEER_IDENTITY,
+	HANDSHAKE_A_ACCEPT_CERTIFICATE,
+
+	__HANDSHAKE_A_ACCEPT_MAX,
+	HANDSHAKE_A_ACCEPT_MAX = (__HANDSHAKE_A_ACCEPT_MAX - 1)
+};
+
+enum {
+	HANDSHAKE_A_DONE_STATUS = 1,
+	HANDSHAKE_A_DONE_SOCKFD,
+	HANDSHAKE_A_DONE_REMOTE_AUTH,
+
+	__HANDSHAKE_A_DONE_MAX,
+	HANDSHAKE_A_DONE_MAX = (__HANDSHAKE_A_DONE_MAX - 1)
+};
+
+enum {
+	HANDSHAKE_CMD_READY = 1,
+	HANDSHAKE_CMD_ACCEPT,
+	HANDSHAKE_CMD_DONE,
+
+	__HANDSHAKE_CMD_MAX,
+	HANDSHAKE_CMD_MAX = (__HANDSHAKE_CMD_MAX - 1)
+};
+
+#define HANDSHAKE_MCGRP_NONE	"none"
+
+#endif /* _UAPI_LINUX_HANDSHAKE_H */
diff --git a/net/Kconfig b/net/Kconfig
index f806722bccf4..4b800706cc76 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -68,6 +68,11 @@ source "net/iucv/Kconfig"
 source "net/smc/Kconfig"
 source "net/xdp/Kconfig"
 
+config NET_HANDSHAKE
+	bool
+	depends on SUNRPC || NVME_TARGET_TCP || NVME_TCP
+	default y
+
 config INET
 	bool "TCP/IP networking"
 	help
diff --git a/net/Makefile b/net/Makefile
index 0914bea9c335..bfee380efb46 100644
--- a/net/Makefile
+++ b/net/Makefile
@@ -79,3 +79,4 @@ obj-$(CONFIG_NET_NCSI)		+= ncsi/
 obj-$(CONFIG_XDP_SOCKETS)	+= xdp/
 obj-$(CONFIG_MPTCP)		+= mptcp/
 obj-$(CONFIG_MCTP)		+= mctp/
+obj-$(CONFIG_NET_HANDSHAKE)	+= handshake/
diff --git a/net/handshake/Makefile b/net/handshake/Makefile
new file mode 100644
index 000000000000..d38736de45da
--- /dev/null
+++ b/net/handshake/Makefile
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Makefile for the Generic HANDSHAKE service
+#
+# Author: Chuck Lever <chuck.lever@oracle.com>
+#
+# Copyright (c) 2023, Oracle and/or its affiliates.
+#
+
+obj-y += handshake.o
+handshake-y := genl.o netlink.o request.o trace.o
diff --git a/net/handshake/genl.c b/net/handshake/genl.c
new file mode 100644
index 000000000000..652f37d19bd6
--- /dev/null
+++ b/net/handshake/genl.c
@@ -0,0 +1,57 @@
+// SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)
+/* Do not edit directly, auto-generated from: */
+/*	Documentation/netlink/specs/handshake.yaml */
+/* YNL-GEN kernel source */
+
+#include <net/netlink.h>
+#include <net/genetlink.h>
+
+#include "genl.h"
+
+#include <linux/handshake.h>
+
+/* HANDSHAKE_CMD_ACCEPT - do */
+static const struct nla_policy handshake_accept_nl_policy[HANDSHAKE_A_ACCEPT_HANDLER_CLASS + 1] = {
+	[HANDSHAKE_A_ACCEPT_HANDLER_CLASS] = NLA_POLICY_MAX(NLA_U32, 1),
+};
+
+/* HANDSHAKE_CMD_DONE - do */
+static const struct nla_policy handshake_done_nl_policy[HANDSHAKE_A_DONE_REMOTE_AUTH + 1] = {
+	[HANDSHAKE_A_DONE_STATUS] = { .type = NLA_U32, },
+	[HANDSHAKE_A_DONE_SOCKFD] = { .type = NLA_U32, },
+	[HANDSHAKE_A_DONE_REMOTE_AUTH] = { .type = NLA_U32, },
+};
+
+/* Ops table for handshake */
+static const struct genl_split_ops handshake_nl_ops[] = {
+	{
+		.cmd		= HANDSHAKE_CMD_ACCEPT,
+		.doit		= handshake_nl_accept_doit,
+		.policy		= handshake_accept_nl_policy,
+		.maxattr	= HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
+		.flags		= GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
+	},
+	{
+		.cmd		= HANDSHAKE_CMD_DONE,
+		.doit		= handshake_nl_done_doit,
+		.policy		= handshake_done_nl_policy,
+		.maxattr	= HANDSHAKE_A_DONE_REMOTE_AUTH,
+		.flags		= GENL_CMD_CAP_DO,
+	},
+};
+
+static const struct genl_multicast_group handshake_nl_mcgrps[] = {
+	[HANDSHAKE_NLGRP_NONE] = { "none", },
+};
+
+struct genl_family handshake_nl_family __ro_after_init = {
+	.name		= HANDSHAKE_FAMILY_NAME,
+	.version	= HANDSHAKE_FAMILY_VERSION,
+	.netnsok	= true,
+	.parallel_ops	= true,
+	.module		= THIS_MODULE,
+	.split_ops	= handshake_nl_ops,
+	.n_split_ops	= ARRAY_SIZE(handshake_nl_ops),
+	.mcgrps		= handshake_nl_mcgrps,
+	.n_mcgrps	= ARRAY_SIZE(handshake_nl_mcgrps),
+};
diff --git a/net/handshake/genl.h b/net/handshake/genl.h
new file mode 100644
index 000000000000..a1eb7ccccc7f
--- /dev/null
+++ b/net/handshake/genl.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
+/* Do not edit directly, auto-generated from: */
+/*	Documentation/netlink/specs/handshake.yaml */
+/* YNL-GEN kernel header */
+
+#ifndef _LINUX_HANDSHAKE_GEN_H
+#define _LINUX_HANDSHAKE_GEN_H
+
+#include <net/netlink.h>
+#include <net/genetlink.h>
+
+#include <linux/handshake.h>
+
+int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *info);
+int handshake_nl_done_doit(struct sk_buff *skb, struct genl_info *info);
+
+enum {
+	HANDSHAKE_NLGRP_NONE,
+};
+
+extern struct genl_family handshake_nl_family;
+
+#endif /* _LINUX_HANDSHAKE_GEN_H */
diff --git a/net/handshake/handshake.h b/net/handshake/handshake.h
new file mode 100644
index 000000000000..8d4ede30e0df
--- /dev/null
+++ b/net/handshake/handshake.h
@@ -0,0 +1,81 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Generic netlink handshake service
+ *
+ * Author: Chuck Lever <chuck.lever@oracle.com>
+ *
+ * Copyright (c) 2023, Oracle and/or its affiliates.
+ */
+
+#ifndef _INTERNAL_HANDSHAKE_H
+#define _INTERNAL_HANDSHAKE_H
+
+/* Per-net namespace context */
+struct handshake_net {
+	spinlock_t		hn_lock;	/* protects next 3 fields */
+	int			hn_pending;
+	int			hn_pending_max;
+	struct list_head	hn_requests;
+
+	unsigned long		hn_flags;
+};
+
+enum hn_flags_bits {
+	HANDSHAKE_F_NET_DRAINING,
+};
+
+struct handshake_proto;
+
+/* One handshake request */
+struct handshake_req {
+	struct list_head		hr_list;
+	struct rhash_head		hr_rhash;
+	unsigned long			hr_flags;
+	const struct handshake_proto	*hr_proto;
+	struct sock			*hr_sk;
+	void				(*hr_odestruct)(struct sock *sk);
+
+	/* Always the last field */
+	char				hr_priv[];
+};
+
+enum hr_flags_bits {
+	HANDSHAKE_F_REQ_COMPLETED,
+};
+
+/* Invariants for all handshake requests for one transport layer
+ * security protocol
+ */
+struct handshake_proto {
+	int			hp_handler_class;
+	size_t			hp_privsize;
+
+	int			(*hp_accept)(struct handshake_req *req,
+					     struct genl_info *info, int fd);
+	void			(*hp_done)(struct handshake_req *req,
+					   unsigned int status,
+					   struct genl_info *info);
+	void			(*hp_destroy)(struct handshake_req *req);
+};
+
+/* netlink.c */
+int handshake_genl_notify(struct net *net, int handler_class, gfp_t flags);
+struct nlmsghdr *handshake_genl_put(struct sk_buff *msg,
+				    struct genl_info *info);
+struct handshake_net *handshake_pernet(struct net *net);
+
+/* request.c */
+struct handshake_req *handshake_req_alloc(const struct handshake_proto *proto,
+					  gfp_t flags);
+int handshake_req_hash_init(void);
+void handshake_req_hash_destroy(void);
+void *handshake_req_private(struct handshake_req *req);
+struct handshake_req *handshake_req_hash_lookup(struct sock *sk);
+struct handshake_req *handshake_req_next(struct handshake_net *hn, int class);
+int handshake_req_submit(struct socket *sock, struct handshake_req *req,
+			 gfp_t flags);
+void handshake_complete(struct handshake_req *req, unsigned int status,
+			struct genl_info *info);
+bool handshake_req_cancel(struct socket *sock);
+
+#endif /* _INTERNAL_HANDSHAKE_H */
diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
new file mode 100644
index 000000000000..0997fd4c6e0b
--- /dev/null
+++ b/net/handshake/netlink.c
@@ -0,0 +1,308 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Generic netlink handshake service
+ *
+ * Author: Chuck Lever <chuck.lever@oracle.com>
+ *
+ * Copyright (c) 2023, Oracle and/or its affiliates.
+ */
+
+#include <linux/types.h>
+#include <linux/socket.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/mm.h>
+
+#include <net/sock.h>
+#include <net/genetlink.h>
+#include <net/netns/generic.h>
+
+#include <uapi/linux/handshake.h>
+#include "handshake.h"
+#include "genl.h"
+
+#include <trace/events/handshake.h>
+
+/**
+ * handshake_genl_notify - Notify handlers that a request is waiting
+ * @net: target network namespace
+ * @handler_class: target handler
+ * @flags: memory allocation control flags
+ *
+ * Returns zero on success or a negative errno if notification failed.
+ */
+int handshake_genl_notify(struct net *net, int handler_class, gfp_t flags)
+{
+	struct sk_buff *msg;
+	void *hdr;
+
+	if (!genl_has_listeners(&handshake_nl_family, net, handler_class))
+		return -ESRCH;
+
+	msg = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	hdr = genlmsg_put(msg, 0, 0, &handshake_nl_family, 0,
+			  HANDSHAKE_CMD_READY);
+	if (!hdr)
+		goto out_free;
+
+	if (nla_put_u32(msg, HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
+			handler_class) < 0) {
+		genlmsg_cancel(msg, hdr);
+		goto out_free;
+	}
+
+	genlmsg_end(msg, hdr);
+	return genlmsg_multicast_netns(&handshake_nl_family, net, msg,
+				       0, handler_class, flags);
+
+out_free:
+	nlmsg_free(msg);
+	return -EMSGSIZE;
+}
+
+/**
+ * handshake_genl_put - Create a generic netlink message header
+ * @msg: buffer in which to create the header
+ * @info: generic netlink message context
+ *
+ * Returns a ready-to-use header, or NULL.
+ */
+struct nlmsghdr *handshake_genl_put(struct sk_buff *msg,
+				    struct genl_info *info)
+{
+	return genlmsg_put(msg, info->snd_portid, info->snd_seq,
+			   &handshake_nl_family, 0, info->genlhdr->cmd);
+}
+EXPORT_SYMBOL(handshake_genl_put);
+
+/*
+ * dup() a kernel socket for use as a user space file descriptor
+ * in the current process. The kernel socket must have an
+ * instatiated struct file.
+ *
+ * Implicit argument: "current()"
+ */
+static int handshake_dup(struct socket *sock)
+{
+	struct file *file;
+	int newfd;
+
+	if (!sock->file)
+		return -EBADF;
+
+	file = get_file(sock->file);
+	newfd = get_unused_fd_flags(O_CLOEXEC);
+	if (newfd < 0) {
+		fput(file);
+		return newfd;
+	}
+
+	fd_install(newfd, file);
+	return newfd;
+}
+
+int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	struct net *net = sock_net(skb->sk);
+	struct handshake_net *hn = handshake_pernet(net);
+	struct handshake_req *req = NULL;
+	struct socket *sock;
+	int class, fd, err;
+
+	err = -EOPNOTSUPP;
+	if (!hn)
+		goto out_status;
+
+	err = -EINVAL;
+	if (GENL_REQ_ATTR_CHECK(info, HANDSHAKE_A_ACCEPT_HANDLER_CLASS))
+		goto out_status;
+	class = nla_get_u32(info->attrs[HANDSHAKE_A_ACCEPT_HANDLER_CLASS]);
+
+	err = -EAGAIN;
+	req = handshake_req_next(hn, class);
+	if (!req)
+		goto out_status;
+
+	sock = req->hr_sk->sk_socket;
+	fd = handshake_dup(sock);
+	if (fd < 0) {
+		err = fd;
+		goto out_complete;
+	}
+	err = req->hr_proto->hp_accept(req, info, fd);
+	if (err)
+		goto out_complete;
+
+	trace_handshake_cmd_accept(net, req, req->hr_sk, fd);
+	return 0;
+
+out_complete:
+	handshake_complete(req, -EIO, NULL);
+	fput(sock->file);
+out_status:
+	trace_handshake_cmd_accept_err(net, req, NULL, err);
+	return err;
+}
+
+int handshake_nl_done_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	struct net *net = sock_net(skb->sk);
+	struct socket *sock = NULL;
+	struct handshake_req *req;
+	int fd, status, err;
+
+	if (GENL_REQ_ATTR_CHECK(info, HANDSHAKE_A_DONE_SOCKFD))
+		return -EINVAL;
+	fd = nla_get_u32(info->attrs[HANDSHAKE_A_DONE_SOCKFD]);
+
+	err = 0;
+	sock = sockfd_lookup(fd, &err);
+	if (err) {
+		err = -EBADF;
+		goto out_status;
+	}
+
+	req = handshake_req_hash_lookup(sock->sk);
+	if (!req) {
+		err = -EBUSY;
+		goto out_status;
+	}
+
+	trace_handshake_cmd_done(net, req, sock->sk, fd);
+
+	status = -EIO;
+	if (info->attrs[HANDSHAKE_A_DONE_STATUS])
+		status = nla_get_u32(info->attrs[HANDSHAKE_A_DONE_STATUS]);
+
+	handshake_complete(req, status, info);
+	fput(sock->file);
+	return 0;
+
+out_status:
+	trace_handshake_cmd_done_err(net, req, sock->sk, err);
+	return err;
+}
+
+static unsigned int handshake_net_id;
+
+static int __net_init handshake_net_init(struct net *net)
+{
+	struct handshake_net *hn = net_generic(net, handshake_net_id);
+	unsigned long tmp;
+	struct sysinfo si;
+
+	/*
+	 * Arbitrary limit to prevent handshakes that do not make
+	 * progress from clogging up the system.
+	 */
+	si_meminfo(&si);
+	tmp = si.totalram / (25 * si.mem_unit);
+	hn->hn_pending_max = clamp(tmp, 3UL, 25UL);
+
+	spin_lock_init(&hn->hn_lock);
+	hn->hn_pending = 0;
+	hn->hn_flags = 0;
+	INIT_LIST_HEAD(&hn->hn_requests);
+	return 0;
+}
+
+static void __net_exit handshake_net_exit(struct net *net)
+{
+	struct handshake_net *hn = net_generic(net, handshake_net_id);
+	struct handshake_req *req;
+	LIST_HEAD(requests);
+
+	/*
+	 * Drain the net's pending list. Requests that have been
+	 * accepted and are in progress will be destroyed when
+	 * the socket is closed.
+	 */
+	spin_lock(&hn->hn_lock);
+	set_bit(HANDSHAKE_F_NET_DRAINING, &hn->hn_flags);
+	list_splice_init(&requests, &hn->hn_requests);
+	spin_unlock(&hn->hn_lock);
+
+	while (!list_empty(&requests)) {
+		req = list_first_entry(&requests, struct handshake_req, hr_list);
+		list_del(&req->hr_list);
+
+		/*
+		 * Requests on this list have not yet been
+		 * accepted, so they do not have an fd to put.
+		 */
+
+		handshake_complete(req, -ETIMEDOUT, NULL);
+	}
+}
+
+static struct pernet_operations __net_initdata handshake_genl_net_ops = {
+	.init		= handshake_net_init,
+	.exit		= handshake_net_exit,
+	.id		= &handshake_net_id,
+	.size		= sizeof(struct handshake_net),
+};
+
+/**
+ * handshake_pernet - Get the handshake private per-net structure
+ * @net: network namespace
+ *
+ * Returns a pointer to the net's private per-net structure for the
+ * handshake module, or NULL if handshake_init() failed.
+ */
+struct handshake_net *handshake_pernet(struct net *net)
+{
+	return handshake_net_id ?
+		net_generic(net, handshake_net_id) : NULL;
+}
+
+static int __init handshake_init(void)
+{
+	int ret;
+
+	ret = handshake_req_hash_init();
+	if (ret) {
+		pr_warn("handshake: hash initialization failed (%d)\n", ret);
+		return ret;
+	}
+
+	ret = genl_register_family(&handshake_nl_family);
+	if (ret) {
+		pr_warn("handshake: netlink registration failed (%d)\n", ret);
+		handshake_req_hash_destroy();
+		return ret;
+	}
+
+	/*
+	 * ORDER: register_pernet_subsys must be done last.
+	 *
+	 *	If initialization does not make it past pernet_subsys
+	 *	registration, then handshake_net_id will remain 0. That
+	 *	shunts the handshake consumer API to return ENOTSUPP
+	 *	to prevent it from dereferencing something that hasn't
+	 *	been allocated.
+	 */
+	ret = register_pernet_subsys(&handshake_genl_net_ops);
+	if (ret) {
+		pr_warn("handshake: pernet registration failed (%d)\n", ret);
+		genl_unregister_family(&handshake_nl_family);
+		handshake_req_hash_destroy();
+	}
+
+	return ret;
+}
+
+static void __exit handshake_exit(void)
+{
+	unregister_pernet_subsys(&handshake_genl_net_ops);
+	handshake_net_id = 0;
+
+	handshake_req_hash_destroy();
+	genl_unregister_family(&handshake_nl_family);
+}
+
+module_init(handshake_init);
+module_exit(handshake_exit);
diff --git a/net/handshake/request.c b/net/handshake/request.c
new file mode 100644
index 000000000000..0ef18a38c047
--- /dev/null
+++ b/net/handshake/request.c
@@ -0,0 +1,344 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Handshake request lifetime events
+ *
+ * Author: Chuck Lever <chuck.lever@oracle.com>
+ *
+ * Copyright (c) 2023, Oracle and/or its affiliates.
+ */
+
+#include <linux/types.h>
+#include <linux/socket.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/inet.h>
+#include <linux/fdtable.h>
+#include <linux/rhashtable.h>
+
+#include <net/sock.h>
+#include <net/genetlink.h>
+#include <net/netns/generic.h>
+
+#include <uapi/linux/handshake.h>
+#include "handshake.h"
+
+#include <trace/events/handshake.h>
+
+/*
+ * We need both a handshake_req -> sock mapping, and a sock ->
+ * handshake_req mapping. Both are one-to-one.
+ *
+ * To avoid adding another pointer field to struct sock, net/handshake
+ * maintains a hash table, indexed by the memory address of @sock, to
+ * find the struct handshake_req outstanding for that socket. The
+ * reverse direction uses a simple pointer field in the handshake_req
+ * struct.
+ */
+
+static struct rhashtable handshake_rhashtbl ____cacheline_aligned_in_smp;
+
+static const struct rhashtable_params handshake_rhash_params = {
+	.key_len		= sizeof_field(struct handshake_req, hr_sk),
+	.key_offset		= offsetof(struct handshake_req, hr_sk),
+	.head_offset		= offsetof(struct handshake_req, hr_rhash),
+	.automatic_shrinking	= true,
+};
+
+int handshake_req_hash_init(void)
+{
+	return rhashtable_init(&handshake_rhashtbl, &handshake_rhash_params);
+}
+
+void handshake_req_hash_destroy(void)
+{
+	rhashtable_destroy(&handshake_rhashtbl);
+}
+
+struct handshake_req *handshake_req_hash_lookup(struct sock *sk)
+{
+	return rhashtable_lookup_fast(&handshake_rhashtbl, &sk,
+				      handshake_rhash_params);
+}
+
+static bool handshake_req_hash_add(struct handshake_req *req)
+{
+	int ret;
+
+	ret = rhashtable_lookup_insert_fast(&handshake_rhashtbl,
+					    &req->hr_rhash,
+					    handshake_rhash_params);
+	return ret == 0;
+}
+
+static void handshake_req_destroy(struct handshake_req *req)
+{
+	if (req->hr_proto->hp_destroy)
+		req->hr_proto->hp_destroy(req);
+	rhashtable_remove_fast(&handshake_rhashtbl, &req->hr_rhash,
+			       handshake_rhash_params);
+	kfree(req);
+}
+
+static void handshake_sk_destruct(struct sock *sk)
+{
+	void (*sk_destruct)(struct sock *sk);
+	struct handshake_req *req;
+
+	req = handshake_req_hash_lookup(sk);
+	if (!req)
+		return;
+
+	trace_handshake_destruct(sock_net(sk), req, sk);
+	sk_destruct = req->hr_odestruct;
+	handshake_req_destroy(req);
+	if (sk_destruct)
+		sk_destruct(sk);
+}
+
+/**
+ * handshake_req_alloc - Allocate a handshake request
+ * @proto: security protocol
+ * @flags: memory allocation flags
+ *
+ * Returns an initialized handshake_req or NULL.
+ */
+struct handshake_req *handshake_req_alloc(const struct handshake_proto *proto,
+					  gfp_t flags)
+{
+	struct handshake_req *req;
+
+	if (!proto)
+		return NULL;
+	if (proto->hp_handler_class <= HANDSHAKE_HANDLER_CLASS_NONE)
+		return NULL;
+	if (proto->hp_handler_class >= HANDSHAKE_HANDLER_CLASS_MAX)
+		return NULL;
+	if (!proto->hp_accept || !proto->hp_done)
+		return NULL;
+
+	req = kzalloc(struct_size(req, hr_priv, proto->hp_privsize), flags);
+	if (!req)
+		return NULL;
+
+	INIT_LIST_HEAD(&req->hr_list);
+	req->hr_proto = proto;
+	return req;
+}
+EXPORT_SYMBOL(handshake_req_alloc);
+
+/**
+ * handshake_req_private - Get per-handshake private data
+ * @req: handshake arguments
+ *
+ */
+void *handshake_req_private(struct handshake_req *req)
+{
+	return (void *)&req->hr_priv;
+}
+EXPORT_SYMBOL(handshake_req_private);
+
+static bool __add_pending_locked(struct handshake_net *hn,
+				 struct handshake_req *req)
+{
+	if (WARN_ON_ONCE(!list_empty(&req->hr_list)))
+		return false;
+	hn->hn_pending++;
+	list_add_tail(&req->hr_list, &hn->hn_requests);
+	return true;
+}
+
+static void __remove_pending_locked(struct handshake_net *hn,
+				    struct handshake_req *req)
+{
+	hn->hn_pending--;
+	list_del_init(&req->hr_list);
+}
+
+/*
+ * Returns %true if the request was found on @net's pending list,
+ * otherwise %false.
+ *
+ * If @req was on a pending list, it has not yet been accepted.
+ */
+static bool remove_pending(struct handshake_net *hn, struct handshake_req *req)
+{
+	bool ret = false;
+
+	spin_lock(&hn->hn_lock);
+	if (!list_empty(&req->hr_list)) {
+		__remove_pending_locked(hn, req);
+		ret = true;
+	}
+	spin_unlock(&hn->hn_lock);
+
+	return ret;
+}
+
+struct handshake_req *handshake_req_next(struct handshake_net *hn, int class)
+{
+	struct handshake_req *req, *pos;
+
+	req = NULL;
+	spin_lock(&hn->hn_lock);
+	list_for_each_entry(pos, &hn->hn_requests, hr_list) {
+		if (pos->hr_proto->hp_handler_class != class)
+			continue;
+		__remove_pending_locked(hn, pos);
+		req = pos;
+		break;
+	}
+	spin_unlock(&hn->hn_lock);
+
+	return req;
+}
+
+/**
+ * handshake_req_submit - Submit a handshake request
+ * @sock: open socket on which to perform the handshake
+ * @req: handshake arguments
+ * @flags: memory allocation flags
+ *
+ * Return values:
+ *   %0: Request queued
+ *   %-EINVAL: Invalid argument
+ *   %-EBUSY: A handshake is already under way for this socket
+ *   %-ESRCH: No handshake agent is available
+ *   %-EAGAIN: Too many pending handshake requests
+ *   %-ENOMEM: Failed to allocate memory
+ *   %-EMSGSIZE: Failed to construct notification message
+ *   %-EOPNOTSUPP: Handshake module not initialized
+ *
+ * A zero return value from handshake_req_submit() means that
+ * exactly one subsequent completion callback is guaranteed.
+ *
+ * A negative return value from handshake_req_submit() means that
+ * no completion callback will be done and that @req has been
+ * destroyed.
+ */
+int handshake_req_submit(struct socket *sock, struct handshake_req *req,
+			 gfp_t flags)
+{
+	struct handshake_net *hn;
+	struct net *net;
+	int ret;
+
+	if (!sock || !req || !sock->file) {
+		kfree(req);
+		return -EINVAL;
+	}
+
+	req->hr_sk = sock->sk;
+	if (!req->hr_sk) {
+		kfree(req);
+		return -EINVAL;
+	}
+	req->hr_odestruct = req->hr_sk->sk_destruct;
+	req->hr_sk->sk_destruct = handshake_sk_destruct;
+
+	ret = -EOPNOTSUPP;
+	net = sock_net(req->hr_sk);
+	hn = handshake_pernet(net);
+	if (!hn)
+		goto out_err;
+
+	ret = -EAGAIN;
+	if (READ_ONCE(hn->hn_pending) >= hn->hn_pending_max)
+		goto out_err;
+
+	spin_lock(&hn->hn_lock);
+	ret = -EOPNOTSUPP;
+	if (test_bit(HANDSHAKE_F_NET_DRAINING, &hn->hn_flags))
+		goto out_unlock;
+	ret = -EBUSY;
+	if (!handshake_req_hash_add(req))
+		goto out_unlock;
+	if (!__add_pending_locked(hn, req))
+		goto out_unlock;
+	spin_unlock(&hn->hn_lock);
+
+	ret = handshake_genl_notify(net, req->hr_proto->hp_handler_class,
+				    flags);
+	if (ret) {
+		trace_handshake_notify_err(net, req, req->hr_sk, ret);
+		if (remove_pending(hn, req))
+			goto out_err;
+	}
+
+	/* Prevent socket release while a handshake request is pending */
+	sock_hold(req->hr_sk);
+
+	trace_handshake_submit(net, req, req->hr_sk);
+	return 0;
+
+out_unlock:
+	spin_unlock(&hn->hn_lock);
+out_err:
+	trace_handshake_submit_err(net, req, req->hr_sk, ret);
+	handshake_req_destroy(req);
+	return ret;
+}
+EXPORT_SYMBOL(handshake_req_submit);
+
+void handshake_complete(struct handshake_req *req, unsigned int status,
+			struct genl_info *info)
+{
+	struct sock *sk = req->hr_sk;
+	struct net *net = sock_net(sk);
+
+	if (!test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
+		trace_handshake_complete(net, req, sk, status);
+		req->hr_proto->hp_done(req, status, info);
+
+		/* Handshake request is no longer pending */
+		sock_put(sk);
+	}
+}
+
+/**
+ * handshake_req_cancel - Cancel an in-progress handshake
+ * @sock: socket on which there is an ongoing handshake
+ *
+ * Request cancellation races with request completion. To determine
+ * who won, callers examine the return value from this function.
+ *
+ * Return values:
+ *   %true - Uncompleted handshake request was canceled or not found
+ *   %false - Handshake request already completed
+ */
+bool handshake_req_cancel(struct socket *sock)
+{
+	struct handshake_req *req;
+	struct handshake_net *hn;
+	struct sock *sk;
+	struct net *net;
+
+	sk = sock->sk;
+	net = sock_net(sk);
+	req = handshake_req_hash_lookup(sk);
+	if (!req) {
+		trace_handshake_cancel_none(net, req, sk);
+		return true;
+	}
+
+	hn = handshake_pernet(net);
+	if (hn && remove_pending(hn, req)) {
+		/* Request hadn't been accepted */
+		trace_handshake_cancel(net, req, sk);
+		sock_put(sk);
+		return true;
+	}
+	if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
+		/* Request already completed */
+		trace_handshake_cancel_busy(net, req, sk);
+		return false;
+	}
+
+	trace_handshake_cancel(net, req, sk);
+
+	/* Handshake request is no longer pending */
+	sock_put(sk);
+
+	return true;
+}
+EXPORT_SYMBOL(handshake_req_cancel);
diff --git a/net/handshake/trace.c b/net/handshake/trace.c
new file mode 100644
index 000000000000..1c4d8e27e17a
--- /dev/null
+++ b/net/handshake/trace.c
@@ -0,0 +1,20 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Trace points for transport security layer handshakes.
+ *
+ * Author: Chuck Lever <chuck.lever@oracle.com>
+ *
+ * Copyright (c) 2023, Oracle and/or its affiliates.
+ */
+
+#include <linux/types.h>
+
+#include <net/sock.h>
+#include <net/netlink.h>
+#include <net/genetlink.h>
+
+#include "handshake.h"
+
+#define CREATE_TRACE_POINTS
+
+#include <trace/events/handshake.h>



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v8 2/4] net/handshake: Add a kernel API for requesting a TLSv1.3 handshake
  2023-04-03 18:45 [PATCH v8 0/4] Another crack at a handshake upcall mechanism Chuck Lever
  2023-04-03 18:46 ` [PATCH v8 1/4] net/handshake: Create a NETLINK service for handling handshake requests Chuck Lever
@ 2023-04-03 18:46 ` Chuck Lever
  2023-04-03 18:46 ` [PATCH v8 3/4] net/handshake: Add Kunit tests for the handshake consumer API Chuck Lever
  2023-04-03 18:46 ` [PATCH v8 4/4] SUNRPC: Recognize control messages in server-side TCP socket code Chuck Lever
  3 siblings, 0 replies; 16+ messages in thread
From: Chuck Lever @ 2023-04-03 18:46 UTC (permalink / raw)
  To: kuba, pabeni, edumazet, borisp; +Cc: netdev, kernel-tls-handshake, john.haxby

From: Chuck Lever <chuck.lever@oracle.com>

To enable kernel consumers of TLS to request a TLS handshake, add
support to net/handshake/ to request a handshake upcall.

This patch also acts as a template for adding handshake upcall
support for other kernel transport layer security providers.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 Documentation/netlink/specs/handshake.yaml |    4 
 Documentation/networking/index.rst         |    1 
 Documentation/networking/tls-handshake.rst |  217 +++++++++++++++
 MAINTAINERS                                |    2 
 include/net/handshake.h                    |   43 +++
 include/uapi/linux/handshake.h             |    2 
 net/handshake/Makefile                     |    2 
 net/handshake/genl.c                       |    3 
 net/handshake/genl.h                       |    1 
 net/handshake/tlshd.c                      |  417 ++++++++++++++++++++++++++++
 10 files changed, 689 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/networking/tls-handshake.rst
 create mode 100644 include/net/handshake.h
 create mode 100644 net/handshake/tlshd.c

diff --git a/Documentation/netlink/specs/handshake.yaml b/Documentation/netlink/specs/handshake.yaml
index 0333d92b1438..614f1a585511 100644
--- a/Documentation/netlink/specs/handshake.yaml
+++ b/Documentation/netlink/specs/handshake.yaml
@@ -16,7 +16,7 @@ definitions:
     type: enum
     name: handler-class
     value-start: 0
-    entries: [ none, max ]
+    entries: [ none, tlshd, max ]
   -
     type: enum
     name: msg-type
@@ -120,3 +120,5 @@ mcast-groups:
   list:
     -
       name: none
+    -
+      name: tlshd
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 24bb256d6d53..a164ff074356 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -36,6 +36,7 @@ Contents:
    scaling
    tls
    tls-offload
+   tls-handshake
    nfc
    6lowpan
    6pack
diff --git a/Documentation/networking/tls-handshake.rst b/Documentation/networking/tls-handshake.rst
new file mode 100644
index 000000000000..a2817a88e905
--- /dev/null
+++ b/Documentation/networking/tls-handshake.rst
@@ -0,0 +1,217 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================
+In-Kernel TLS Handshake
+=======================
+
+Overview
+========
+
+Transport Layer Security (TLS) is a Upper Layer Protocol (ULP) that runs
+over TCP. TLS provides end-to-end data integrity and confidentiality in
+addition to peer authentication.
+
+The kernel's kTLS implementation handles the TLS record subprotocol, but
+does not handle the TLS handshake subprotocol which is used to establish
+a TLS session. Kernel consumers can use the API described here to
+request TLS session establishment.
+
+There are several possible ways to provide a handshake service in the
+kernel. The API described here is designed to hide the details of those
+implementations so that in-kernel TLS consumers do not need to be
+aware of how the handshake gets done.
+
+
+User handshake agent
+====================
+
+As of this writing, there is no TLS handshake implementation in the
+Linux kernel. To provide a handshake service, a handshake agent
+(typically in user space) is started in each network namespace where a
+kernel consumer might require a TLS handshake. Handshake agents listen
+for events sent from the kernel that indicate a handshake request is
+waiting.
+
+An open socket is passed to a handshake agent via a netlink operation,
+which creates a socket descriptor in the agent's file descriptor table.
+If the handshake completes successfully, the handshake agent promotes
+the socket to use the TLS ULP and sets the session information using the
+SOL_TLS socket options. The handshake agent returns the socket to the
+kernel via a second netlink operation.
+
+
+Kernel Handshake API
+====================
+
+A kernel TLS consumer initiates a client-side TLS handshake on an open
+socket by invoking one of the tls_client_hello() functions. First, it
+fills in a structure that contains the parameters of the request:
+
+.. code-block:: c
+
+  struct tls_handshake_args {
+        struct socket   *ta_sock;
+        tls_done_func_t ta_done;
+        void            *ta_data;
+        unsigned int    ta_timeout_ms;
+        key_serial_t    ta_keyring;
+        key_serial_t    ta_my_cert;
+        key_serial_t    ta_my_privkey;
+        unsigned int    ta_num_peerids;
+        key_serial_t    ta_my_peerids[5];
+  };
+
+The @ta_sock field references an open and connected socket. The consumer
+must hold a reference on the socket to prevent it from being destroyed
+while the handshake is in progress. The consumer must also have
+instantiated a struct file in sock->file.
+
+
+@ta_done contains a callback function that is invoked when the handshake
+has completed. Further explanation of this function is in the "Handshake
+Completion" sesction below.
+
+The consumer can fill in the @ta_timeout_ms field to force the servicing
+handshake agent to exit after a number of milliseconds. This enables the
+socket to be fully closed once both the kernel and the handshake agent
+have closed their endpoints.
+
+Authentication material such as x.509 certificates, private certificate
+keys, and pre-shared keys are provided to the handshake agent in keys
+that are instantiated by the consumer before making the handshake
+request. The consumer can provide a private keyring that is linked into
+the handshake agent's process keyring in the @ta_keyring field to prevent
+access of those keys by other subsystems.
+
+To request an x.509-authenticated TLS session, the consumer fills in
+the @ta_my_cert and @ta_my_privkey fields with the serial numbers of
+keys containing an x.509 certificate and the private key for that
+certificate. Then, it invokes this function:
+
+.. code-block:: c
+
+  ret = tls_client_hello_x509(args, gfp_flags);
+
+The function returns zero when the handshake request is under way. A
+zero return guarantees the callback function @ta_done will be invoked
+for this socket. The function returns a negative errno if the handshake
+could not be started. A negative errno guarantees the callback function
+@ta_done will not be invoked on this socket.
+
+
+To initiate a client-side TLS handshake with a pre-shared key, use:
+
+.. code-block:: c
+
+  ret = tls_client_hello_psk(args, gfp_flags);
+
+However, in this case, the consumer fills in the @ta_my_peerids array
+with serial numbers of keys containing the peer identities it wishes
+to offer, and the @ta_num_peerids field with the number of array
+entries it has filled in. The other fields are filled in as above.
+
+
+To initiate an anonymous client-side TLS handshake use:
+
+.. code-block:: c
+
+  ret = tls_client_hello_anon(args, gfp_flags);
+
+The handshake agent presents no peer identity information to the remote
+during this type of handshake. Only server authentication (ie the client
+verifies the server's identity) is performed during the handshake. Thus
+the established session uses encryption only.
+
+
+Consumers that are in-kernel servers use:
+
+.. code-block:: c
+
+  ret = tls_server_hello_x509(args, gfp_flags);
+
+or
+
+.. code-block:: c
+
+  ret = tls_server_hello_psk(args, gfp_flags);
+
+The argument structure is filled in as above.
+
+
+If the consumer needs to cancel the handshake request, say, due to a ^C
+or other exigent event, the consumer can invoke:
+
+.. code-block:: c
+
+  bool tls_handshake_cancel(sock);
+
+This function returns true if the handshake request associated with
+@sock has been canceled. The consumer's handshake completion callback
+will not be invoked. If this function returns false, then the consumer's
+completion callback has already been invoked.
+
+
+Handshake Completion
+====================
+
+When the handshake agent has completed processing, it notifies the
+kernel that the socket may be used by the consumer again. At this point,
+the consumer's handshake completion callback, provided in the @ta_done
+field in the tls_handshake_args structure, is invoked.
+
+The synopsis of this function is:
+
+.. code-block:: c
+
+  typedef void	(*tls_done_func_t)(void *data, int status,
+                                   key_serial_t peerid);
+
+The consumer provides a cookie in the @ta_data field of the
+tls_handshake_args structure that is returned in the @data parameter of
+this callback. The consumer uses the cookie to match the callback to the
+thread waiting for the handshake to complete.
+
+The success status of the handshake is returned via the @status
+parameter:
+
++------------+----------------------------------------------+
+|  status    |  meaning                                     |
++============+==============================================+
+|  0         |  TLS session established successfully        |
++------------+----------------------------------------------+
+|  -EACCESS  |  Remote peer rejected the handshake or       |
+|            |  authentication failed                       |
++------------+----------------------------------------------+
+|  -ENOMEM   |  Temporary resource allocation failure       |
++------------+----------------------------------------------+
+|  -EINVAL   |  Consumer provided an invalid argument       |
++------------+----------------------------------------------+
+|  -ENOKEY   |  Missing authentication material             |
++------------+----------------------------------------------+
+|  -EIO      |  An unexpected fault occurred                |
++------------+----------------------------------------------+
+
+The @peerid parameter contains the serial number of a key containing the
+remote peer's identity or the value TLS_NO_PEERID if the session is not
+authenticated.
+
+A best practice is to close and destroy the socket immediately if the
+handshake failed.
+
+
+Other considerations
+--------------------
+
+While a handshake is under way, the kernel consumer must alter the
+socket's sk_data_ready callback function to ignore all incoming data.
+Once the handshake completion callback function has been invoked, normal
+receive operation can be resumed.
+
+Once a TLS session is established, the consumer must provide a buffer
+for and then examine the control message (CMSG) that is part of every
+subsequent sock_recvmsg(). Each control message indicates whether the
+received message data is TLS record data or session metadata.
+
+See tls.rst for details on how a kTLS consumer recognizes incoming
+(decrypted) application data, alerts, and handshake packets once the
+socket has been promoted to use the TLS ULP.
diff --git a/MAINTAINERS b/MAINTAINERS
index d78d67ac7f89..ef7edd7e9097 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8946,6 +8946,8 @@ L:	kernel-tls-handshake@lists.linux.dev
 L:	netdev@vger.kernel.org
 S:	Maintained
 F:	Documentation/netlink/specs/handshake.yaml
+F:	Documentation/networking/tls-handshake.rst
+F:	include/net/handshake.h
 F:	include/trace/events/handshake.h
 F:	net/handshake/
 
diff --git a/include/net/handshake.h b/include/net/handshake.h
new file mode 100644
index 000000000000..6ff4ba77830e
--- /dev/null
+++ b/include/net/handshake.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Generic netlink HANDSHAKE service.
+ *
+ * Author: Chuck Lever <chuck.lever@oracle.com>
+ *
+ * Copyright (c) 2023, Oracle and/or its affiliates.
+ */
+
+#ifndef _NET_HANDSHAKE_H
+#define _NET_HANDSHAKE_H
+
+enum {
+	TLS_NO_KEYRING = 0,
+	TLS_NO_PEERID = 0,
+	TLS_NO_CERT = 0,
+	TLS_NO_PRIVKEY = 0,
+};
+
+typedef void	(*tls_done_func_t)(void *data, int status,
+				   key_serial_t peerid);
+
+struct tls_handshake_args {
+	struct socket		*ta_sock;
+	tls_done_func_t		ta_done;
+	void			*ta_data;
+	unsigned int		ta_timeout_ms;
+	key_serial_t		ta_keyring;
+	key_serial_t		ta_my_cert;
+	key_serial_t		ta_my_privkey;
+	unsigned int		ta_num_peerids;
+	key_serial_t		ta_my_peerids[5];
+};
+
+int tls_client_hello_anon(const struct tls_handshake_args *args, gfp_t flags);
+int tls_client_hello_x509(const struct tls_handshake_args *args, gfp_t flags);
+int tls_client_hello_psk(const struct tls_handshake_args *args, gfp_t flags);
+int tls_server_hello_x509(const struct tls_handshake_args *args, gfp_t flags);
+int tls_server_hello_psk(const struct tls_handshake_args *args, gfp_t flags);
+
+bool tls_handshake_cancel(struct socket *sock);
+
+#endif /* _NET_HANDSHAKE_H */
diff --git a/include/uapi/linux/handshake.h b/include/uapi/linux/handshake.h
index 7f66ff489b87..1de4d0b95325 100644
--- a/include/uapi/linux/handshake.h
+++ b/include/uapi/linux/handshake.h
@@ -11,6 +11,7 @@
 
 enum handshake_handler_class {
 	HANDSHAKE_HANDLER_CLASS_NONE,
+	HANDSHAKE_HANDLER_CLASS_TLSHD,
 	HANDSHAKE_HANDLER_CLASS_MAX,
 };
 
@@ -67,5 +68,6 @@ enum {
 };
 
 #define HANDSHAKE_MCGRP_NONE	"none"
+#define HANDSHAKE_MCGRP_TLSHD	"tlshd"
 
 #endif /* _UAPI_LINUX_HANDSHAKE_H */
diff --git a/net/handshake/Makefile b/net/handshake/Makefile
index d38736de45da..a089f7e3df24 100644
--- a/net/handshake/Makefile
+++ b/net/handshake/Makefile
@@ -8,4 +8,4 @@
 #
 
 obj-y += handshake.o
-handshake-y := genl.o netlink.o request.o trace.o
+handshake-y := genl.o netlink.o request.o tlshd.o trace.o
diff --git a/net/handshake/genl.c b/net/handshake/genl.c
index 652f37d19bd6..9f29efb1493e 100644
--- a/net/handshake/genl.c
+++ b/net/handshake/genl.c
@@ -12,7 +12,7 @@
 
 /* HANDSHAKE_CMD_ACCEPT - do */
 static const struct nla_policy handshake_accept_nl_policy[HANDSHAKE_A_ACCEPT_HANDLER_CLASS + 1] = {
-	[HANDSHAKE_A_ACCEPT_HANDLER_CLASS] = NLA_POLICY_MAX(NLA_U32, 1),
+	[HANDSHAKE_A_ACCEPT_HANDLER_CLASS] = NLA_POLICY_MAX(NLA_U32, 2),
 };
 
 /* HANDSHAKE_CMD_DONE - do */
@@ -42,6 +42,7 @@ static const struct genl_split_ops handshake_nl_ops[] = {
 
 static const struct genl_multicast_group handshake_nl_mcgrps[] = {
 	[HANDSHAKE_NLGRP_NONE] = { "none", },
+	[HANDSHAKE_NLGRP_TLSHD] = { "tlshd", },
 };
 
 struct genl_family handshake_nl_family __ro_after_init = {
diff --git a/net/handshake/genl.h b/net/handshake/genl.h
index a1eb7ccccc7f..2c1f1aa6a02a 100644
--- a/net/handshake/genl.h
+++ b/net/handshake/genl.h
@@ -16,6 +16,7 @@ int handshake_nl_done_doit(struct sk_buff *skb, struct genl_info *info);
 
 enum {
 	HANDSHAKE_NLGRP_NONE,
+	HANDSHAKE_NLGRP_TLSHD,
 };
 
 extern struct genl_family handshake_nl_family;
diff --git a/net/handshake/tlshd.c b/net/handshake/tlshd.c
new file mode 100644
index 000000000000..c6f19908bae6
--- /dev/null
+++ b/net/handshake/tlshd.c
@@ -0,0 +1,417 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Establish a TLS session for a kernel socket consumer
+ * using the tlshd user space handler.
+ *
+ * Author: Chuck Lever <chuck.lever@oracle.com>
+ *
+ * Copyright (c) 2021-2023, Oracle and/or its affiliates.
+ */
+
+#include <linux/types.h>
+#include <linux/socket.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/key.h>
+
+#include <net/sock.h>
+#include <net/handshake.h>
+#include <net/genetlink.h>
+
+#include <uapi/linux/keyctl.h>
+#include <uapi/linux/handshake.h>
+#include "handshake.h"
+
+struct tls_handshake_req {
+	void			(*th_consumer_done)(void *data, int status,
+						    key_serial_t peerid);
+	void			*th_consumer_data;
+
+	int			th_type;
+	unsigned int		th_timeout_ms;
+	int			th_auth_mode;
+	key_serial_t		th_keyring;
+	key_serial_t		th_certificate;
+	key_serial_t		th_privkey;
+
+	unsigned int		th_num_peerids;
+	key_serial_t		th_peerid[5];
+};
+
+static struct tls_handshake_req *
+tls_handshake_req_init(struct handshake_req *req,
+		       const struct tls_handshake_args *args)
+{
+	struct tls_handshake_req *treq = handshake_req_private(req);
+
+	treq->th_timeout_ms = args->ta_timeout_ms;
+	treq->th_consumer_done = args->ta_done;
+	treq->th_consumer_data = args->ta_data;
+	treq->th_keyring = args->ta_keyring;
+	treq->th_num_peerids = 0;
+	treq->th_certificate = TLS_NO_CERT;
+	treq->th_privkey = TLS_NO_PRIVKEY;
+	return treq;
+}
+
+static void tls_handshake_remote_peerids(struct tls_handshake_req *treq,
+					 struct genl_info *info)
+{
+	struct nlattr *head = nlmsg_attrdata(info->nlhdr, GENL_HDRLEN);
+	int rem, len = nlmsg_attrlen(info->nlhdr, GENL_HDRLEN);
+	struct nlattr *nla;
+	unsigned int i;
+
+	i = 0;
+	nla_for_each_attr(nla, head, len, rem) {
+		if (nla_type(nla) == HANDSHAKE_A_DONE_REMOTE_AUTH)
+			i++;
+	}
+	if (!i)
+		return;
+	treq->th_num_peerids = min_t(unsigned int, i,
+				     ARRAY_SIZE(treq->th_peerid));
+
+	i = 0;
+	nla_for_each_attr(nla, head, len, rem) {
+		if (nla_type(nla) == HANDSHAKE_A_DONE_REMOTE_AUTH)
+			treq->th_peerid[i++] = nla_get_u32(nla);
+		if (i >= treq->th_num_peerids)
+			break;
+	}
+}
+
+/**
+ * tls_handshake_done - callback to handle a CMD_DONE request
+ * @req: socket on which the handshake was performed
+ * @status: session status code
+ * @info: full results of session establishment
+ *
+ */
+static void tls_handshake_done(struct handshake_req *req,
+			       unsigned int status, struct genl_info *info)
+{
+	struct tls_handshake_req *treq = handshake_req_private(req);
+
+	treq->th_peerid[0] = TLS_NO_PEERID;
+	if (info)
+		tls_handshake_remote_peerids(treq, info);
+
+	treq->th_consumer_done(treq->th_consumer_data, -status,
+			       treq->th_peerid[0]);
+}
+
+#if IS_ENABLED(CONFIG_KEYS)
+static int tls_handshake_private_keyring(struct tls_handshake_req *treq)
+{
+	key_ref_t process_keyring_ref, keyring_ref;
+	int ret;
+
+	if (treq->th_keyring == TLS_NO_KEYRING)
+		return 0;
+
+	process_keyring_ref = lookup_user_key(KEY_SPEC_PROCESS_KEYRING,
+					      KEY_LOOKUP_CREATE,
+					      KEY_NEED_WRITE);
+	if (IS_ERR(process_keyring_ref)) {
+		ret = PTR_ERR(process_keyring_ref);
+		goto out;
+	}
+
+	keyring_ref = lookup_user_key(treq->th_keyring, KEY_LOOKUP_CREATE,
+				      KEY_NEED_LINK);
+	if (IS_ERR(keyring_ref)) {
+		ret = PTR_ERR(keyring_ref);
+		goto out_put_key;
+	}
+
+	ret = key_link(key_ref_to_ptr(process_keyring_ref),
+		       key_ref_to_ptr(keyring_ref));
+
+	key_ref_put(keyring_ref);
+out_put_key:
+	key_ref_put(process_keyring_ref);
+out:
+	return ret;
+}
+#else
+static int tls_handshake_private_keyring(struct tls_handshake_req *treq)
+{
+	return 0;
+}
+#endif
+
+static int tls_handshake_put_peer_identity(struct sk_buff *msg,
+					   struct tls_handshake_req *treq)
+{
+	unsigned int i;
+
+	for (i = 0; i < treq->th_num_peerids; i++)
+		if (nla_put_u32(msg, HANDSHAKE_A_ACCEPT_PEER_IDENTITY,
+				treq->th_peerid[i]) < 0)
+			return -EMSGSIZE;
+	return 0;
+}
+
+static int tls_handshake_put_certificate(struct sk_buff *msg,
+					 struct tls_handshake_req *treq)
+{
+	struct nlattr *entry_attr;
+
+	if (treq->th_certificate == TLS_NO_CERT &&
+	    treq->th_privkey == TLS_NO_PRIVKEY)
+		return 0;
+
+	entry_attr = nla_nest_start(msg, HANDSHAKE_A_ACCEPT_CERTIFICATE);
+	if (!entry_attr)
+		return -EMSGSIZE;
+
+	if (nla_put_u32(msg, HANDSHAKE_A_X509_CERT,
+			treq->th_certificate) ||
+	    nla_put_u32(msg, HANDSHAKE_A_X509_PRIVKEY,
+			treq->th_privkey)) {
+		nla_nest_cancel(msg, entry_attr);
+		return -EMSGSIZE;
+	}
+
+	nla_nest_end(msg, entry_attr);
+	return 0;
+}
+
+/**
+ * tls_handshake_accept - callback to construct a CMD_ACCEPT response
+ * @req: handshake parameters to return
+ * @info: generic netlink message context
+ * @fd: file descriptor to be returned
+ *
+ * Returns zero on success, or a negative errno on failure.
+ */
+static int tls_handshake_accept(struct handshake_req *req,
+				struct genl_info *info, int fd)
+{
+	struct tls_handshake_req *treq = handshake_req_private(req);
+	struct nlmsghdr *hdr;
+	struct sk_buff *msg;
+	int ret;
+
+	ret = tls_handshake_private_keyring(treq);
+	if (ret < 0)
+		goto out;
+
+	ret = -ENOMEM;
+	msg = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		goto out;
+	hdr = handshake_genl_put(msg, info);
+	if (!hdr)
+		goto out_cancel;
+
+	ret = -EMSGSIZE;
+	ret = nla_put_u32(msg, HANDSHAKE_A_ACCEPT_SOCKFD, fd);
+	if (ret < 0)
+		goto out_cancel;
+	ret = nla_put_u32(msg, HANDSHAKE_A_ACCEPT_MESSAGE_TYPE, treq->th_type);
+	if (ret < 0)
+		goto out_cancel;
+	if (treq->th_timeout_ms) {
+		ret = nla_put_u32(msg, HANDSHAKE_A_ACCEPT_TIMEOUT, treq->th_timeout_ms);
+		if (ret < 0)
+			goto out_cancel;
+	}
+
+	ret = nla_put_u32(msg, HANDSHAKE_A_ACCEPT_AUTH_MODE,
+			  treq->th_auth_mode);
+	if (ret < 0)
+		goto out_cancel;
+	switch (treq->th_auth_mode) {
+	case HANDSHAKE_AUTH_PSK:
+		ret = tls_handshake_put_peer_identity(msg, treq);
+		if (ret < 0)
+			goto out_cancel;
+		break;
+	case HANDSHAKE_AUTH_X509:
+		ret = tls_handshake_put_certificate(msg, treq);
+		if (ret < 0)
+			goto out_cancel;
+		break;
+	}
+
+	genlmsg_end(msg, hdr);
+	return genlmsg_reply(msg, info);
+
+out_cancel:
+	genlmsg_cancel(msg, hdr);
+out:
+	return ret;
+}
+
+static const struct handshake_proto tls_handshake_proto = {
+	.hp_handler_class	= HANDSHAKE_HANDLER_CLASS_TLSHD,
+	.hp_privsize		= sizeof(struct tls_handshake_req),
+
+	.hp_accept		= tls_handshake_accept,
+	.hp_done		= tls_handshake_done,
+};
+
+/**
+ * tls_client_hello_anon - request an anonymous TLS handshake on a socket
+ * @args: socket and handshake parameters for this request
+ * @flags: memory allocation control flags
+ *
+ * Return values:
+ *   %0: Handshake request enqueue; ->done will be called when complete
+ *   %-ESRCH: No user agent is available
+ *   %-ENOMEM: Memory allocation failed
+ */
+int tls_client_hello_anon(const struct tls_handshake_args *args, gfp_t flags)
+{
+	struct tls_handshake_req *treq;
+	struct handshake_req *req;
+
+	req = handshake_req_alloc(&tls_handshake_proto, flags);
+	if (!req)
+		return -ENOMEM;
+	treq = tls_handshake_req_init(req, args);
+	treq->th_type = HANDSHAKE_MSG_TYPE_CLIENTHELLO;
+	treq->th_auth_mode = HANDSHAKE_AUTH_UNAUTH;
+
+	return handshake_req_submit(args->ta_sock, req, flags);
+}
+EXPORT_SYMBOL(tls_client_hello_anon);
+
+/**
+ * tls_client_hello_x509 - request an x.509-based TLS handshake on a socket
+ * @args: socket and handshake parameters for this request
+ * @flags: memory allocation control flags
+ *
+ * Return values:
+ *   %0: Handshake request enqueue; ->done will be called when complete
+ *   %-ESRCH: No user agent is available
+ *   %-ENOMEM: Memory allocation failed
+ */
+int tls_client_hello_x509(const struct tls_handshake_args *args, gfp_t flags)
+{
+	struct tls_handshake_req *treq;
+	struct handshake_req *req;
+
+	req = handshake_req_alloc(&tls_handshake_proto, flags);
+	if (!req)
+		return -ENOMEM;
+	treq = tls_handshake_req_init(req, args);
+	treq->th_type = HANDSHAKE_MSG_TYPE_CLIENTHELLO;
+	treq->th_auth_mode = HANDSHAKE_AUTH_X509;
+	treq->th_certificate = args->ta_my_cert;
+	treq->th_privkey = args->ta_my_privkey;
+
+	return handshake_req_submit(args->ta_sock, req, flags);
+}
+EXPORT_SYMBOL(tls_client_hello_x509);
+
+/**
+ * tls_client_hello_psk - request a PSK-based TLS handshake on a socket
+ * @args: socket and handshake parameters for this request
+ * @flags: memory allocation control flags
+ *
+ * Return values:
+ *   %0: Handshake request enqueue; ->done will be called when complete
+ *   %-EINVAL: Wrong number of local peer IDs
+ *   %-ESRCH: No user agent is available
+ *   %-ENOMEM: Memory allocation failed
+ */
+int tls_client_hello_psk(const struct tls_handshake_args *args, gfp_t flags)
+{
+	struct tls_handshake_req *treq;
+	struct handshake_req *req;
+	unsigned int i;
+
+	if (!args->ta_num_peerids ||
+	    args->ta_num_peerids > ARRAY_SIZE(treq->th_peerid))
+		return -EINVAL;
+
+	req = handshake_req_alloc(&tls_handshake_proto, flags);
+	if (!req)
+		return -ENOMEM;
+	treq = tls_handshake_req_init(req, args);
+	treq->th_type = HANDSHAKE_MSG_TYPE_CLIENTHELLO;
+	treq->th_auth_mode = HANDSHAKE_AUTH_PSK;
+	treq->th_num_peerids = args->ta_num_peerids;
+	for (i = 0; i < args->ta_num_peerids; i++)
+		treq->th_peerid[i] = args->ta_my_peerids[i];
+
+	return handshake_req_submit(args->ta_sock, req, flags);
+}
+EXPORT_SYMBOL(tls_client_hello_psk);
+
+/**
+ * tls_server_hello_x509 - request a server TLS handshake on a socket
+ * @args: socket and handshake parameters for this request
+ * @flags: memory allocation control flags
+ *
+ * Return values:
+ *   %0: Handshake request enqueue; ->done will be called when complete
+ *   %-ESRCH: No user agent is available
+ *   %-ENOMEM: Memory allocation failed
+ */
+int tls_server_hello_x509(const struct tls_handshake_args *args, gfp_t flags)
+{
+	struct tls_handshake_req *treq;
+	struct handshake_req *req;
+
+	req = handshake_req_alloc(&tls_handshake_proto, flags);
+	if (!req)
+		return -ENOMEM;
+	treq = tls_handshake_req_init(req, args);
+	treq->th_type = HANDSHAKE_MSG_TYPE_SERVERHELLO;
+	treq->th_auth_mode = HANDSHAKE_AUTH_X509;
+	treq->th_certificate = args->ta_my_cert;
+	treq->th_privkey = args->ta_my_privkey;
+
+	return handshake_req_submit(args->ta_sock, req, flags);
+}
+EXPORT_SYMBOL(tls_server_hello_x509);
+
+/**
+ * tls_server_hello_psk - request a server TLS handshake on a socket
+ * @args: socket and handshake parameters for this request
+ * @flags: memory allocation control flags
+ *
+ * Return values:
+ *   %0: Handshake request enqueue; ->done will be called when complete
+ *   %-ESRCH: No user agent is available
+ *   %-ENOMEM: Memory allocation failed
+ */
+int tls_server_hello_psk(const struct tls_handshake_args *args, gfp_t flags)
+{
+	struct tls_handshake_req *treq;
+	struct handshake_req *req;
+
+	req = handshake_req_alloc(&tls_handshake_proto, flags);
+	if (!req)
+		return -ENOMEM;
+	treq = tls_handshake_req_init(req, args);
+	treq->th_type = HANDSHAKE_MSG_TYPE_SERVERHELLO;
+	treq->th_auth_mode = HANDSHAKE_AUTH_PSK;
+	treq->th_num_peerids = 1;
+	treq->th_peerid[0] = args->ta_my_peerids[0];
+
+	return handshake_req_submit(args->ta_sock, req, flags);
+}
+EXPORT_SYMBOL(tls_server_hello_psk);
+
+/**
+ * tls_handshake_cancel - cancel a pending handshake
+ * @sock: socket on which there is an ongoing handshake
+ *
+ * Request cancellation races with request completion. To determine
+ * who won, callers examine the return value from this function.
+ *
+ * Return values:
+ *   %true - Uncompleted handshake request was canceled or not found
+ *   %false - Handshake request already completed
+ */
+bool tls_handshake_cancel(struct socket *sock)
+{
+	return handshake_req_cancel(sock);
+}
+EXPORT_SYMBOL(tls_handshake_cancel);



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v8 3/4] net/handshake: Add Kunit tests for the handshake consumer API
  2023-04-03 18:45 [PATCH v8 0/4] Another crack at a handshake upcall mechanism Chuck Lever
  2023-04-03 18:46 ` [PATCH v8 1/4] net/handshake: Create a NETLINK service for handling handshake requests Chuck Lever
  2023-04-03 18:46 ` [PATCH v8 2/4] net/handshake: Add a kernel API for requesting a TLSv1.3 handshake Chuck Lever
@ 2023-04-03 18:46 ` Chuck Lever
  2023-04-05  0:24   ` Jakub Kicinski
  2023-04-03 18:46 ` [PATCH v8 4/4] SUNRPC: Recognize control messages in server-side TCP socket code Chuck Lever
  3 siblings, 1 reply; 16+ messages in thread
From: Chuck Lever @ 2023-04-03 18:46 UTC (permalink / raw)
  To: kuba, pabeni, edumazet, borisp; +Cc: netdev, kernel-tls-handshake, john.haxby

From: Chuck Lever <chuck.lever@oracle.com>

These verify the API contracts and help exercise lifetime rules for
consumer sockets and handshake_req structures.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/Kconfig                |   15 +
 net/handshake/Makefile     |    2 
 net/handshake/kunit-test.c |  523 ++++++++++++++++++++++++++++++++++++++++++++
 net/handshake/netlink.c    |   19 ++
 4 files changed, 559 insertions(+)
 create mode 100644 net/handshake/kunit-test.c

diff --git a/net/Kconfig b/net/Kconfig
index 4b800706cc76..7d39c1773eb4 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -73,6 +73,21 @@ config NET_HANDSHAKE
 	depends on SUNRPC || NVME_TARGET_TCP || NVME_TCP
 	default y
 
+config NET_HANDSHAKE_KUNIT_TEST
+	tristate "KUnit tests for the handshake upcall mechanism" if !KUNIT_ALL_TESTS
+	default KUNIT_ALL_TESTS
+	depends on KUNIT
+	help
+	  This builds the KUnit tests for the handshake upcall mechanism.
+
+	  KUnit tests run during boot and output the results to the debug
+	  log in TAP format (https://testanything.org/). Only useful for
+	  kernel devs running KUnit test harness and are not for inclusion
+	  into a production build.
+
+	  For more information on KUnit and unit tests in general, refer
+	  to the KUnit documentation in Documentation/dev-tools/kunit/.
+
 config INET
 	bool "TCP/IP networking"
 	help
diff --git a/net/handshake/Makefile b/net/handshake/Makefile
index a089f7e3df24..d3b26a29118f 100644
--- a/net/handshake/Makefile
+++ b/net/handshake/Makefile
@@ -9,3 +9,5 @@
 
 obj-y += handshake.o
 handshake-y := genl.o netlink.o request.o tlshd.o trace.o
+
+obj-$(CONFIG_NET_HANDSHAKE_KUNIT_TEST) += kunit-test.o
diff --git a/net/handshake/kunit-test.c b/net/handshake/kunit-test.c
new file mode 100644
index 000000000000..50af0da6c485
--- /dev/null
+++ b/net/handshake/kunit-test.c
@@ -0,0 +1,523 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2023 Oracle and/or its affiliates.
+ *
+ * KUnit test of the handshake upcall mechanism.
+ */
+
+#include <kunit/test.h>
+#include <kunit/visibility.h>
+
+#include <linux/kernel.h>
+
+#include <net/sock.h>
+#include <net/genetlink.h>
+#include <net/netns/generic.h>
+
+#include <uapi/linux/handshake.h>
+#include "handshake.h"
+
+MODULE_IMPORT_NS(EXPORTED_FOR_KUNIT_TESTING);
+
+static int test_accept_func(struct handshake_req *req, struct genl_info *info,
+			    int fd)
+{
+	return 0;
+}
+
+static void test_done_func(struct handshake_req *req, unsigned int status,
+			   struct genl_info *info)
+{
+}
+
+struct handshake_req_alloc_test_param {
+	const char			*desc;
+	struct handshake_proto		*proto;
+	gfp_t				gfp;
+	bool				expect_success;
+};
+
+static struct handshake_proto handshake_req_alloc_proto_2 = {
+	.hp_handler_class	= HANDSHAKE_HANDLER_CLASS_NONE,
+};
+
+static struct handshake_proto handshake_req_alloc_proto_3 = {
+	.hp_handler_class	= HANDSHAKE_HANDLER_CLASS_MAX,
+};
+
+static struct handshake_proto handshake_req_alloc_proto_4 = {
+	.hp_handler_class	= HANDSHAKE_HANDLER_CLASS_TLSHD,
+};
+
+static struct handshake_proto handshake_req_alloc_proto_5 = {
+	.hp_handler_class	= HANDSHAKE_HANDLER_CLASS_TLSHD,
+	.hp_accept		= test_accept_func,
+};
+
+static struct handshake_proto handshake_req_alloc_proto_6 = {
+	.hp_handler_class	= HANDSHAKE_HANDLER_CLASS_TLSHD,
+	.hp_privsize		= UINT_MAX,
+	.hp_accept		= test_accept_func,
+	.hp_done		= test_done_func,
+};
+
+static struct handshake_proto handshake_req_alloc_proto_good = {
+	.hp_handler_class	= HANDSHAKE_HANDLER_CLASS_TLSHD,
+	.hp_accept		= test_accept_func,
+	.hp_done		= test_done_func,
+};
+
+static const
+struct handshake_req_alloc_test_param handshake_req_alloc_params[] = {
+	{
+		.desc			= "handshake_req_alloc NULL proto",
+		.proto			= NULL,
+		.gfp			= GFP_KERNEL,
+		.expect_success		= false,
+	},
+	{
+		.desc			= "handshake_req_alloc CLASS_NONE",
+		.proto			= &handshake_req_alloc_proto_2,
+		.gfp			= GFP_KERNEL,
+		.expect_success		= false,
+	},
+	{
+		.desc			= "handshake_req_alloc CLASS_MAX",
+		.proto			= &handshake_req_alloc_proto_3,
+		.gfp			= GFP_KERNEL,
+		.expect_success		= false,
+	},
+	{
+		.desc			= "handshake_req_alloc no callbacks",
+		.proto			= &handshake_req_alloc_proto_4,
+		.gfp			= GFP_KERNEL,
+		.expect_success		= false,
+	},
+	{
+		.desc			= "handshake_req_alloc no done callback",
+		.proto			= &handshake_req_alloc_proto_5,
+		.gfp			= GFP_KERNEL,
+		.expect_success		= false,
+	},
+	{
+		.desc			= "handshake_req_alloc excessive privsize",
+		.proto			= &handshake_req_alloc_proto_6,
+		.gfp			= GFP_KERNEL,
+		.expect_success		= false,
+	},
+	{
+		.desc			= "handshake_req_alloc all good",
+		.proto			= &handshake_req_alloc_proto_good,
+		.gfp			= GFP_KERNEL,
+		.expect_success		= true,
+	},
+};
+
+static void
+handshake_req_alloc_get_desc(const struct handshake_req_alloc_test_param *param,
+			     char *desc)
+{
+	strscpy(desc, param->desc, KUNIT_PARAM_DESC_SIZE);
+}
+
+/* Creates the function handshake_req_alloc_gen_params */
+KUNIT_ARRAY_PARAM(handshake_req_alloc, handshake_req_alloc_params,
+		  handshake_req_alloc_get_desc);
+
+static void handshake_req_alloc_case(struct kunit *test)
+{
+	const struct handshake_req_alloc_test_param *param = test->param_value;
+	struct handshake_req *result;
+
+	/* Arrange */
+
+	/* Act */
+	result = handshake_req_alloc(param->proto, param->gfp);
+
+	/* Assert */
+	if (param->expect_success)
+		KUNIT_EXPECT_NOT_NULL(test, result);
+	else
+		KUNIT_EXPECT_NULL(test, result);
+
+	kfree(result);
+}
+
+static void handshake_req_submit_test1(struct kunit *test)
+{
+	struct socket *sock;
+	int err, result;
+
+	/* Arrange */
+	err = __sock_create(&init_net, PF_INET, SOCK_STREAM, IPPROTO_TCP,
+			    &sock, 1);
+	KUNIT_ASSERT_EQ(test, err, 0);
+
+	/* Act */
+	result = handshake_req_submit(sock, NULL, GFP_KERNEL);
+
+	/* Assert */
+	KUNIT_EXPECT_EQ(test, result, -EINVAL);
+
+	sock_release(sock);
+}
+
+static void handshake_req_submit_test2(struct kunit *test)
+{
+	struct handshake_req *req;
+	int result;
+
+	/* Arrange */
+	req = handshake_req_alloc(&handshake_req_alloc_proto_good, GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, req);
+
+	/* Act */
+	result = handshake_req_submit(NULL, req, GFP_KERNEL);
+
+	/* Assert */
+	KUNIT_EXPECT_EQ(test, result, -EINVAL);
+
+	/* handshake_req_submit() destroys @req on error */
+}
+
+static void handshake_req_submit_test3(struct kunit *test)
+{
+	struct handshake_req *req;
+	struct socket *sock;
+	int err, result;
+
+	/* Arrange */
+	req = handshake_req_alloc(&handshake_req_alloc_proto_good, GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, req);
+
+	err = __sock_create(&init_net, PF_INET, SOCK_STREAM, IPPROTO_TCP,
+			    &sock, 1);
+	KUNIT_ASSERT_EQ(test, err, 0);
+	sock->file = NULL;
+
+	/* Act */
+	result = handshake_req_submit(sock, req, GFP_KERNEL);
+
+	/* Assert */
+	KUNIT_EXPECT_EQ(test, result, -EINVAL);
+
+	/* handshake_req_submit() destroys @req on error */
+	sock_release(sock);
+}
+
+static void handshake_req_submit_test4(struct kunit *test)
+{
+	struct handshake_req *req, *result;
+	struct socket *sock;
+	int err;
+
+	/* Arrange */
+	req = handshake_req_alloc(&handshake_req_alloc_proto_good, GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, req);
+
+	err = __sock_create(&init_net, PF_INET, SOCK_STREAM, IPPROTO_TCP,
+			    &sock, 1);
+	KUNIT_ASSERT_EQ(test, err, 0);
+	sock->file = sock_alloc_file(sock, O_NONBLOCK, NULL);
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, sock->file);
+	KUNIT_ASSERT_NOT_NULL(test, sock->sk);
+
+	err = handshake_req_submit(sock, req, GFP_KERNEL);
+	KUNIT_ASSERT_EQ(test, err, 0);
+
+	/* Act */
+	result = handshake_req_hash_lookup(sock->sk);
+
+	/* Assert */
+	KUNIT_EXPECT_NOT_NULL(test, result);
+	KUNIT_EXPECT_PTR_EQ(test, req, result);
+
+	handshake_req_cancel(sock);
+	sock_release(sock);
+}
+
+static void handshake_req_submit_test5(struct kunit *test)
+{
+	struct handshake_req *req;
+	struct handshake_net *hn;
+	struct socket *sock;
+	struct net *net;
+	int saved, err;
+
+	/* Arrange */
+	req = handshake_req_alloc(&handshake_req_alloc_proto_good, GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, req);
+
+	err = __sock_create(&init_net, PF_INET, SOCK_STREAM, IPPROTO_TCP,
+			    &sock, 1);
+	KUNIT_ASSERT_EQ(test, err, 0);
+	sock->file = sock_alloc_file(sock, O_NONBLOCK, NULL);
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, sock->file);
+	KUNIT_ASSERT_NOT_NULL(test, sock->sk);
+
+	net = sock_net(sock->sk);
+	hn = handshake_pernet(net);
+	KUNIT_ASSERT_NOT_NULL(test, hn);
+
+	saved = hn->hn_pending;
+	hn->hn_pending = hn->hn_pending_max + 1;
+
+	/* Act */
+	err = handshake_req_submit(sock, req, GFP_KERNEL);
+
+	/* Assert */
+	KUNIT_EXPECT_EQ(test, err, -EAGAIN);
+
+	sock_release(sock);
+	hn->hn_pending = saved;
+}
+
+static void handshake_req_submit_test6(struct kunit *test)
+{
+	struct handshake_req *req1, *req2;
+	struct socket *sock;
+	int err;
+
+	/* Arrange */
+	req1 = handshake_req_alloc(&handshake_req_alloc_proto_good, GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, req1);
+	req2 = handshake_req_alloc(&handshake_req_alloc_proto_good, GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, req2);
+
+	err = __sock_create(&init_net, PF_INET, SOCK_STREAM, IPPROTO_TCP,
+			    &sock, 1);
+	KUNIT_ASSERT_EQ(test, err, 0);
+	sock->file = sock_alloc_file(sock, O_NONBLOCK, NULL);
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, sock->file);
+	KUNIT_ASSERT_NOT_NULL(test, sock->sk);
+
+	/* Act */
+	err = handshake_req_submit(sock, req1, GFP_KERNEL);
+	KUNIT_ASSERT_EQ(test, err, 0);
+	err = handshake_req_submit(sock, req2, GFP_KERNEL);
+
+	/* Assert */
+	KUNIT_EXPECT_EQ(test, err, -EBUSY);
+
+	handshake_req_cancel(sock);
+	sock_release(sock);
+}
+
+static void handshake_req_cancel_test1(struct kunit *test)
+{
+	struct handshake_req *req;
+	struct socket *sock;
+	bool result;
+	int err;
+
+	/* Arrange */
+	req = handshake_req_alloc(&handshake_req_alloc_proto_good, GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, req);
+
+	err = __sock_create(&init_net, PF_INET, SOCK_STREAM, IPPROTO_TCP,
+			    &sock, 1);
+	KUNIT_ASSERT_EQ(test, err, 0);
+
+	sock->file = sock_alloc_file(sock, O_NONBLOCK, NULL);
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, sock->file);
+
+	err = handshake_req_submit(sock, req, GFP_KERNEL);
+	KUNIT_ASSERT_EQ(test, err, 0);
+
+	/* NB: handshake_req hasn't been accepted */
+
+	/* Act */
+	result = handshake_req_cancel(sock);
+
+	/* Assert */
+	KUNIT_EXPECT_TRUE(test, result);
+
+	sock_release(sock);
+}
+
+static void handshake_req_cancel_test2(struct kunit *test)
+{
+	struct handshake_req *req, *next;
+	struct handshake_net *hn;
+	struct socket *sock;
+	struct net *net;
+	bool result;
+	int err;
+
+	/* Arrange */
+	req = handshake_req_alloc(&handshake_req_alloc_proto_good, GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, req);
+
+	err = __sock_create(&init_net, PF_INET, SOCK_STREAM, IPPROTO_TCP,
+			    &sock, 1);
+	KUNIT_ASSERT_EQ(test, err, 0);
+
+	sock->file = sock_alloc_file(sock, O_NONBLOCK, NULL);
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, sock->file);
+
+	err = handshake_req_submit(sock, req, GFP_KERNEL);
+	KUNIT_ASSERT_EQ(test, err, 0);
+
+	net = sock_net(sock->sk);
+	hn = handshake_pernet(net);
+	KUNIT_ASSERT_NOT_NULL(test, hn);
+
+	/* Pretend to accept this request */
+	next = handshake_req_next(hn, HANDSHAKE_HANDLER_CLASS_TLSHD);
+	KUNIT_ASSERT_PTR_EQ(test, req, next);
+
+	/* Act */
+	result = handshake_req_cancel(sock);
+
+	/* Assert */
+	KUNIT_EXPECT_TRUE(test, result);
+
+	sock_release(sock);
+}
+
+static void handshake_req_cancel_test3(struct kunit *test)
+{
+	struct handshake_req *req, *next;
+	struct handshake_net *hn;
+	struct socket *sock;
+	struct net *net;
+	bool result;
+	int err;
+
+	/* Arrange */
+	req = handshake_req_alloc(&handshake_req_alloc_proto_good, GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, req);
+
+	err = __sock_create(&init_net, PF_INET, SOCK_STREAM, IPPROTO_TCP,
+			    &sock, 1);
+	KUNIT_ASSERT_EQ(test, err, 0);
+
+	sock->file = sock_alloc_file(sock, O_NONBLOCK, NULL);
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, sock->file);
+
+	err = handshake_req_submit(sock, req, GFP_KERNEL);
+	KUNIT_ASSERT_EQ(test, err, 0);
+
+	net = sock_net(sock->sk);
+	hn = handshake_pernet(net);
+	KUNIT_ASSERT_NOT_NULL(test, hn);
+
+	/* Pretend to accept this request */
+	next = handshake_req_next(hn, HANDSHAKE_HANDLER_CLASS_TLSHD);
+	KUNIT_ASSERT_PTR_EQ(test, req, next);
+
+	/* Pretend to complete this request */
+	handshake_complete(next, -ETIMEDOUT, NULL);
+
+	/* Act */
+	result = handshake_req_cancel(sock);
+
+	/* Assert */
+	KUNIT_EXPECT_FALSE(test, result);
+
+	sock_release(sock);
+}
+
+static struct handshake_req *handshake_req_destroy_test;
+
+static void test_destroy_func(struct handshake_req *req)
+{
+	handshake_req_destroy_test = req;
+}
+
+static struct handshake_proto handshake_req_alloc_proto_destroy = {
+	.hp_handler_class	= HANDSHAKE_HANDLER_CLASS_TLSHD,
+	.hp_accept		= test_accept_func,
+	.hp_done		= test_done_func,
+	.hp_destroy		= test_destroy_func,
+};
+
+static void handshake_req_destroy_test1(struct kunit *test)
+{
+	struct handshake_req *req;
+	struct socket *sock;
+	int err;
+
+	/* Arrange */
+	handshake_req_destroy_test = NULL;
+
+	req = handshake_req_alloc(&handshake_req_alloc_proto_destroy, GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, req);
+
+	err = __sock_create(&init_net, PF_INET, SOCK_STREAM, IPPROTO_TCP,
+			    &sock, 1);
+	KUNIT_ASSERT_EQ(test, err, 0);
+
+	sock->file = sock_alloc_file(sock, O_NONBLOCK, NULL);
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, sock->file);
+
+	err = handshake_req_submit(sock, req, GFP_KERNEL);
+	KUNIT_ASSERT_EQ(test, err, 0);
+
+	handshake_req_cancel(sock);
+
+	/* Act */
+	sock_release(sock);
+
+	/* Assert */
+	KUNIT_EXPECT_PTR_EQ(test, handshake_req_destroy_test, req);
+}
+
+static struct kunit_case handshake_api_test_cases[] = {
+	{
+		.name			= "req_alloc API fuzzing",
+		.run_case		= handshake_req_alloc_case,
+		.generate_params	= handshake_req_alloc_gen_params,
+	},
+	{
+		.name			= "req_submit NULL req arg",
+		.run_case		= handshake_req_submit_test1,
+	},
+	{
+		.name			= "req_submit NULL sock arg",
+		.run_case		= handshake_req_submit_test2,
+	},
+	{
+		.name			= "req_submit NULL sock->file",
+		.run_case		= handshake_req_submit_test3,
+	},
+	{
+		.name			= "req_lookup works",
+		.run_case		= handshake_req_submit_test4,
+	},
+	{
+		.name			= "req_submit max pending",
+		.run_case		= handshake_req_submit_test5,
+	},
+	{
+		.name			= "req_submit multiple",
+		.run_case		= handshake_req_submit_test6,
+	},
+	{
+		.name			= "req_cancel before accept",
+		.run_case		= handshake_req_cancel_test1,
+	},
+	{
+		.name			= "req_cancel after accept",
+		.run_case		= handshake_req_cancel_test2,
+	},
+	{
+		.name			= "req_cancel after done",
+		.run_case		= handshake_req_cancel_test3,
+	},
+	{
+		.name			= "req_destroy works",
+		.run_case		= handshake_req_destroy_test1,
+	},
+	{}
+};
+
+static struct kunit_suite handshake_api_suite = {
+       .name                   = "Handshake API tests",
+       .test_cases             = handshake_api_test_cases,
+};
+
+kunit_test_suites(&handshake_api_suite);
+
+MODULE_DESCRIPTION("Test handshake upcall API functions");
+MODULE_LICENSE("GPL");
diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
index 0997fd4c6e0b..70765371b04e 100644
--- a/net/handshake/netlink.c
+++ b/net/handshake/netlink.c
@@ -24,6 +24,23 @@
 
 #include <trace/events/handshake.h>
 
+#if IS_ENABLED(CONFIG_KUNIT)
+
+/**
+ * handshake_genl_notify - Solve world hunger, tell no-one.
+ * @net: target network namespace
+ * @handler_class: target handler
+ * @flags: memory allocation control flags
+ *
+ * User agent generally isn't running during unit testing.
+ */
+int handshake_genl_notify(struct net *net, int handler_class, gfp_t flags)
+{
+	return 0;
+}
+
+#else
+
 /**
  * handshake_genl_notify - Notify handlers that a request is waiting
  * @net: target network namespace
@@ -64,6 +81,8 @@ int handshake_genl_notify(struct net *net, int handler_class, gfp_t flags)
 	return -EMSGSIZE;
 }
 
+#endif
+
 /**
  * handshake_genl_put - Create a generic netlink message header
  * @msg: buffer in which to create the header



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v8 4/4] SUNRPC: Recognize control messages in server-side TCP socket code
  2023-04-03 18:45 [PATCH v8 0/4] Another crack at a handshake upcall mechanism Chuck Lever
                   ` (2 preceding siblings ...)
  2023-04-03 18:46 ` [PATCH v8 3/4] net/handshake: Add Kunit tests for the handshake consumer API Chuck Lever
@ 2023-04-03 18:46 ` Chuck Lever
  2023-04-04 14:41   ` Chuck Lever
  3 siblings, 1 reply; 16+ messages in thread
From: Chuck Lever @ 2023-04-03 18:46 UTC (permalink / raw)
  To: kuba, pabeni, edumazet, borisp; +Cc: netdev, kernel-tls-handshake, john.haxby

From: Chuck Lever <chuck.lever@oracle.com>

To support kTLS, the server-side TCP socket code needs to watch for
CMSGs, just like on the client side.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/net/tls.h             |    2 ++
 include/trace/events/sunrpc.h |   39 +++++++++++++++++++++++++++++++++
 net/sunrpc/svcsock.c          |   49 +++++++++++++++++++++++++++++++++++++++--
 3 files changed, 88 insertions(+), 2 deletions(-)

diff --git a/include/net/tls.h b/include/net/tls.h
index 154949c7b0c8..6056ce5a2aa5 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -69,6 +69,8 @@ extern const struct tls_cipher_size_desc tls_cipher_size_desc[];
 
 #define TLS_CRYPTO_INFO_READY(info)	((info)->cipher_type)
 
+#define TLS_RECORD_TYPE_ALERT		0x15
+#define TLS_RECORD_TYPE_HANDSHAKE	0x16
 #define TLS_RECORD_TYPE_DATA		0x17
 
 #define TLS_AAD_SPACE_SIZE		13
diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 3ca54536f8f7..3ffb82956bd1 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -49,6 +49,19 @@ TRACE_DEFINE_ENUM(AF_INET6);
 		{ AF_INET,		"AF_INET" },		\
 		{ AF_INET6,		"AF_INET6" })
 
+/*
+ * From https://www.iana.org/assignments/tls-parameters/tls-parameters.xhtml
+ */
+#define rpc_show_tls_content_type(type) \
+	__print_symbolic(type, \
+		{ 20,		"change cipher spec" }, \
+		{ 21,		"alert" }, \
+		{ 22,		"handshake" }, \
+		{ 23,		"application data" }, \
+		{ 24,		"heartbeat" }, \
+		{ 25,		"tls12_cid" }, \
+		{ 26,		"ACK" })
+
 DECLARE_EVENT_CLASS(rpc_xdr_buf_class,
 	TP_PROTO(
 		const struct rpc_task *task,
@@ -2254,6 +2267,32 @@ DECLARE_EVENT_CLASS(svcsock_accept_class,
 DEFINE_ACCEPT_EVENT(accept);
 DEFINE_ACCEPT_EVENT(getpeername);
 
+TRACE_EVENT(svcsock_tls_ctype,
+	TP_PROTO(
+		const struct svc_xprt *xprt,
+		unsigned char ctype
+	),
+
+	TP_ARGS(xprt, ctype),
+
+	TP_STRUCT__entry(
+		SVC_XPRT_ENDPOINT_FIELDS(xprt)
+
+		__field(unsigned long, ctype)
+	),
+
+	TP_fast_assign(
+		SVC_XPRT_ENDPOINT_ASSIGNMENTS(xprt);
+
+		__entry->ctype = ctype;
+	),
+
+	TP_printk(SVC_XPRT_ENDPOINT_FORMAT " %s",
+		SVC_XPRT_ENDPOINT_VARARGS,
+		rpc_show_tls_content_type(__entry->ctype)
+	)
+);
+
 DECLARE_EVENT_CLASS(cache_event,
 	TP_PROTO(
 		const struct cache_detail *cd,
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 03a4f5615086..c8cbe3b5182d 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -43,6 +43,7 @@
 #include <net/udp.h>
 #include <net/tcp.h>
 #include <net/tcp_states.h>
+#include <net/tls.h>
 #include <linux/uaccess.h>
 #include <linux/highmem.h>
 #include <asm/ioctls.h>
@@ -216,6 +217,50 @@ static int svc_one_sock_name(struct svc_sock *svsk, char *buf, int remaining)
 	return len;
 }
 
+static int
+svc_tcp_sock_process_cmsg(struct svc_sock *svsk, struct msghdr *msg,
+			  struct cmsghdr *cmsg, int ret)
+{
+	if (cmsg->cmsg_level == SOL_TLS &&
+	    cmsg->cmsg_type == TLS_GET_RECORD_TYPE) {
+		u8 content_type = *((u8 *)CMSG_DATA(cmsg));
+
+		trace_svcsock_tls_ctype(&svsk->sk_xprt, content_type);
+		switch (content_type) {
+		case TLS_RECORD_TYPE_DATA:
+			/* TLS sets EOR at the end of each application data
+			 * record, even though there might be more frames
+			 * waiting to be decrypted.
+			 */
+			msg->msg_flags &= ~MSG_EOR;
+			break;
+		case TLS_RECORD_TYPE_ALERT:
+			ret = -ENOTCONN;
+			break;
+		default:
+			ret = -EAGAIN;
+		}
+	}
+	return ret;
+}
+
+static int
+svc_tcp_sock_recv_cmsg(struct svc_sock *svsk, struct msghdr *msg)
+{
+	union {
+		struct cmsghdr	cmsg;
+		u8		buf[CMSG_SPACE(sizeof(u8))];
+	} u;
+	int ret;
+
+	msg->msg_control = &u;
+	msg->msg_controllen = sizeof(u);
+	ret = sock_recvmsg(svsk->sk_sock, msg, MSG_DONTWAIT);
+	if (unlikely(msg->msg_controllen != sizeof(u)))
+		ret = svc_tcp_sock_process_cmsg(svsk, msg, &u.cmsg, ret);
+	return ret;
+}
+
 #if ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE
 static void svc_flush_bvec(const struct bio_vec *bvec, size_t size, size_t seek)
 {
@@ -263,7 +308,7 @@ static ssize_t svc_tcp_read_msg(struct svc_rqst *rqstp, size_t buflen,
 		iov_iter_advance(&msg.msg_iter, seek);
 		buflen -= seek;
 	}
-	len = sock_recvmsg(svsk->sk_sock, &msg, MSG_DONTWAIT);
+	len = svc_tcp_sock_recv_cmsg(svsk, &msg);
 	if (len > 0)
 		svc_flush_bvec(bvec, len, seek);
 
@@ -877,7 +922,7 @@ static ssize_t svc_tcp_read_marker(struct svc_sock *svsk,
 		iov.iov_base = ((char *)&svsk->sk_marker) + svsk->sk_tcplen;
 		iov.iov_len  = want;
 		iov_iter_kvec(&msg.msg_iter, ITER_DEST, &iov, 1, want);
-		len = sock_recvmsg(svsk->sk_sock, &msg, MSG_DONTWAIT);
+		len = svc_tcp_sock_recv_cmsg(svsk, &msg);
 		if (len < 0)
 			return len;
 		svsk->sk_tcplen += len;



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v8 4/4] SUNRPC: Recognize control messages in server-side TCP socket code
  2023-04-03 18:46 ` [PATCH v8 4/4] SUNRPC: Recognize control messages in server-side TCP socket code Chuck Lever
@ 2023-04-04 14:41   ` Chuck Lever
  0 siblings, 0 replies; 16+ messages in thread
From: Chuck Lever @ 2023-04-04 14:41 UTC (permalink / raw)
  To: kuba, pabeni, edumazet, borisp; +Cc: netdev, kernel-tls-handshake, john.haxby

On Mon, Apr 03, 2023 at 02:46:22PM -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> To support kTLS, the server-side TCP socket code needs to watch for
> CMSGs, just like on the client side.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  include/net/tls.h             |    2 ++
>  include/trace/events/sunrpc.h |   39 +++++++++++++++++++++++++++++++++
>  net/sunrpc/svcsock.c          |   49 +++++++++++++++++++++++++++++++++++++++--
>  3 files changed, 88 insertions(+), 2 deletions(-)
> 
> diff --git a/include/net/tls.h b/include/net/tls.h
> index 154949c7b0c8..6056ce5a2aa5 100644
> --- a/include/net/tls.h
> +++ b/include/net/tls.h
> @@ -69,6 +69,8 @@ extern const struct tls_cipher_size_desc tls_cipher_size_desc[];
>  
>  #define TLS_CRYPTO_INFO_READY(info)	((info)->cipher_type)
>  
> +#define TLS_RECORD_TYPE_ALERT		0x15
> +#define TLS_RECORD_TYPE_HANDSHAKE	0x16
>  #define TLS_RECORD_TYPE_DATA		0x17
>  
>  #define TLS_AAD_SPACE_SIZE		13

The above hunk is why I included this patch in the repost of this
series. Boris, I can take this bit through the nfsd tree if you (or
John F) Ack it.

Please let me know how you'd prefer to handle it.


> diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
> index 3ca54536f8f7..3ffb82956bd1 100644
> --- a/include/trace/events/sunrpc.h
> +++ b/include/trace/events/sunrpc.h
> @@ -49,6 +49,19 @@ TRACE_DEFINE_ENUM(AF_INET6);
>  		{ AF_INET,		"AF_INET" },		\
>  		{ AF_INET6,		"AF_INET6" })
>  
> +/*
> + * From https://www.iana.org/assignments/tls-parameters/tls-parameters.xhtml
> + */
> +#define rpc_show_tls_content_type(type) \
> +	__print_symbolic(type, \
> +		{ 20,		"change cipher spec" }, \
> +		{ 21,		"alert" }, \
> +		{ 22,		"handshake" }, \
> +		{ 23,		"application data" }, \
> +		{ 24,		"heartbeat" }, \
> +		{ 25,		"tls12_cid" }, \
> +		{ 26,		"ACK" })
> +
>  DECLARE_EVENT_CLASS(rpc_xdr_buf_class,
>  	TP_PROTO(
>  		const struct rpc_task *task,
> @@ -2254,6 +2267,32 @@ DECLARE_EVENT_CLASS(svcsock_accept_class,
>  DEFINE_ACCEPT_EVENT(accept);
>  DEFINE_ACCEPT_EVENT(getpeername);
>  
> +TRACE_EVENT(svcsock_tls_ctype,
> +	TP_PROTO(
> +		const struct svc_xprt *xprt,
> +		unsigned char ctype
> +	),
> +
> +	TP_ARGS(xprt, ctype),
> +
> +	TP_STRUCT__entry(
> +		SVC_XPRT_ENDPOINT_FIELDS(xprt)
> +
> +		__field(unsigned long, ctype)
> +	),
> +
> +	TP_fast_assign(
> +		SVC_XPRT_ENDPOINT_ASSIGNMENTS(xprt);
> +
> +		__entry->ctype = ctype;
> +	),
> +
> +	TP_printk(SVC_XPRT_ENDPOINT_FORMAT " %s",
> +		SVC_XPRT_ENDPOINT_VARARGS,
> +		rpc_show_tls_content_type(__entry->ctype)
> +	)
> +);
> +
>  DECLARE_EVENT_CLASS(cache_event,
>  	TP_PROTO(
>  		const struct cache_detail *cd,
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index 03a4f5615086..c8cbe3b5182d 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -43,6 +43,7 @@
>  #include <net/udp.h>
>  #include <net/tcp.h>
>  #include <net/tcp_states.h>
> +#include <net/tls.h>
>  #include <linux/uaccess.h>
>  #include <linux/highmem.h>
>  #include <asm/ioctls.h>
> @@ -216,6 +217,50 @@ static int svc_one_sock_name(struct svc_sock *svsk, char *buf, int remaining)
>  	return len;
>  }
>  
> +static int
> +svc_tcp_sock_process_cmsg(struct svc_sock *svsk, struct msghdr *msg,
> +			  struct cmsghdr *cmsg, int ret)
> +{
> +	if (cmsg->cmsg_level == SOL_TLS &&
> +	    cmsg->cmsg_type == TLS_GET_RECORD_TYPE) {
> +		u8 content_type = *((u8 *)CMSG_DATA(cmsg));
> +
> +		trace_svcsock_tls_ctype(&svsk->sk_xprt, content_type);
> +		switch (content_type) {
> +		case TLS_RECORD_TYPE_DATA:
> +			/* TLS sets EOR at the end of each application data
> +			 * record, even though there might be more frames
> +			 * waiting to be decrypted.
> +			 */
> +			msg->msg_flags &= ~MSG_EOR;
> +			break;
> +		case TLS_RECORD_TYPE_ALERT:
> +			ret = -ENOTCONN;
> +			break;
> +		default:
> +			ret = -EAGAIN;
> +		}
> +	}
> +	return ret;
> +}
> +
> +static int
> +svc_tcp_sock_recv_cmsg(struct svc_sock *svsk, struct msghdr *msg)
> +{
> +	union {
> +		struct cmsghdr	cmsg;
> +		u8		buf[CMSG_SPACE(sizeof(u8))];
> +	} u;
> +	int ret;
> +
> +	msg->msg_control = &u;
> +	msg->msg_controllen = sizeof(u);
> +	ret = sock_recvmsg(svsk->sk_sock, msg, MSG_DONTWAIT);
> +	if (unlikely(msg->msg_controllen != sizeof(u)))
> +		ret = svc_tcp_sock_process_cmsg(svsk, msg, &u.cmsg, ret);
> +	return ret;
> +}
> +
>  #if ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE
>  static void svc_flush_bvec(const struct bio_vec *bvec, size_t size, size_t seek)
>  {
> @@ -263,7 +308,7 @@ static ssize_t svc_tcp_read_msg(struct svc_rqst *rqstp, size_t buflen,
>  		iov_iter_advance(&msg.msg_iter, seek);
>  		buflen -= seek;
>  	}
> -	len = sock_recvmsg(svsk->sk_sock, &msg, MSG_DONTWAIT);
> +	len = svc_tcp_sock_recv_cmsg(svsk, &msg);
>  	if (len > 0)
>  		svc_flush_bvec(bvec, len, seek);
>  
> @@ -877,7 +922,7 @@ static ssize_t svc_tcp_read_marker(struct svc_sock *svsk,
>  		iov.iov_base = ((char *)&svsk->sk_marker) + svsk->sk_tcplen;
>  		iov.iov_len  = want;
>  		iov_iter_kvec(&msg.msg_iter, ITER_DEST, &iov, 1, want);
> -		len = sock_recvmsg(svsk->sk_sock, &msg, MSG_DONTWAIT);
> +		len = svc_tcp_sock_recv_cmsg(svsk, &msg);
>  		if (len < 0)
>  			return len;
>  		svsk->sk_tcplen += len;
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v8 1/4] net/handshake: Create a NETLINK service for handling handshake requests
  2023-04-03 18:46 ` [PATCH v8 1/4] net/handshake: Create a NETLINK service for handling handshake requests Chuck Lever
@ 2023-04-04 15:36   ` Chuck Lever III
  2023-04-04 15:44     ` Hannes Reinecke
  2023-04-05  0:23   ` Jakub Kicinski
  1 sibling, 1 reply; 16+ messages in thread
From: Chuck Lever III @ 2023-04-04 15:36 UTC (permalink / raw)
  To: Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Boris Pismenny, open list:NETWORKING [GENERAL],
	kernel-tls-handshake, John Haxby



> On Apr 3, 2023, at 2:46 PM, Chuck Lever <cel@kernel.org> wrote:
> 
> diff --git a/net/handshake/request.c b/net/handshake/request.c
> new file mode 100644
> index 000000000000..0ef18a38c047
> --- /dev/null
> +++ b/net/handshake/request.c
> @@ -0,0 +1,344 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Handshake request lifetime events
> + *
> + * Author: Chuck Lever <chuck.lever@oracle.com>
> + *
> + * Copyright (c) 2023, Oracle and/or its affiliates.
> + */
> +
> +#include <linux/types.h>
> +#include <linux/socket.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/skbuff.h>
> +#include <linux/inet.h>
> +#include <linux/fdtable.h>
> +#include <linux/rhashtable.h>
> +
> +#include <net/sock.h>
> +#include <net/genetlink.h>
> +#include <net/netns/generic.h>
> +
> +#include <uapi/linux/handshake.h>
> +#include "handshake.h"
> +
> +#include <trace/events/handshake.h>
> +
> +/*
> + * We need both a handshake_req -> sock mapping, and a sock ->
> + * handshake_req mapping. Both are one-to-one.
> + *
> + * To avoid adding another pointer field to struct sock, net/handshake
> + * maintains a hash table, indexed by the memory address of @sock, to
> + * find the struct handshake_req outstanding for that socket. The
> + * reverse direction uses a simple pointer field in the handshake_req
> + * struct.
> + */
> +
> +static struct rhashtable handshake_rhashtbl ____cacheline_aligned_in_smp;
> +
> +static const struct rhashtable_params handshake_rhash_params = {
> + .key_len = sizeof_field(struct handshake_req, hr_sk),
> + .key_offset = offsetof(struct handshake_req, hr_sk),
> + .head_offset = offsetof(struct handshake_req, hr_rhash),
> + .automatic_shrinking = true,
> +};
> +
> +int handshake_req_hash_init(void)
> +{
> + return rhashtable_init(&handshake_rhashtbl, &handshake_rhash_params);
> +}
> +
> +void handshake_req_hash_destroy(void)
> +{
> + rhashtable_destroy(&handshake_rhashtbl);
> +}
> +
> +struct handshake_req *handshake_req_hash_lookup(struct sock *sk)
> +{
> + return rhashtable_lookup_fast(&handshake_rhashtbl, &sk,
> +      handshake_rhash_params);
> +}
> +
> +static bool handshake_req_hash_add(struct handshake_req *req)
> +{
> + int ret;
> +
> + ret = rhashtable_lookup_insert_fast(&handshake_rhashtbl,
> +    &req->hr_rhash,
> +    handshake_rhash_params);
> + return ret == 0;
> +}
> +
> +static void handshake_req_destroy(struct handshake_req *req)
> +{
> + if (req->hr_proto->hp_destroy)
> + req->hr_proto->hp_destroy(req);
> + rhashtable_remove_fast(&handshake_rhashtbl, &req->hr_rhash,
> +       handshake_rhash_params);
> + kfree(req);
> +}
> +
> +static void handshake_sk_destruct(struct sock *sk)
> +{
> + void (*sk_destruct)(struct sock *sk);
> + struct handshake_req *req;
> +
> + req = handshake_req_hash_lookup(sk);
> + if (!req)
> + return;
> +
> + trace_handshake_destruct(sock_net(sk), req, sk);
> + sk_destruct = req->hr_odestruct;
> + handshake_req_destroy(req);
> + if (sk_destruct)
> + sk_destruct(sk);
> +}
> +
> +/**
> + * handshake_req_alloc - Allocate a handshake request
> + * @proto: security protocol
> + * @flags: memory allocation flags
> + *
> + * Returns an initialized handshake_req or NULL.
> + */
> +struct handshake_req *handshake_req_alloc(const struct handshake_proto *proto,
> +  gfp_t flags)
> +{
> + struct handshake_req *req;
> +
> + if (!proto)
> + return NULL;
> + if (proto->hp_handler_class <= HANDSHAKE_HANDLER_CLASS_NONE)
> + return NULL;
> + if (proto->hp_handler_class >= HANDSHAKE_HANDLER_CLASS_MAX)
> + return NULL;
> + if (!proto->hp_accept || !proto->hp_done)
> + return NULL;
> +
> + req = kzalloc(struct_size(req, hr_priv, proto->hp_privsize), flags);
> + if (!req)
> + return NULL;
> +
> + INIT_LIST_HEAD(&req->hr_list);
> + req->hr_proto = proto;
> + return req;
> +}
> +EXPORT_SYMBOL(handshake_req_alloc);
> +
> +/**
> + * handshake_req_private - Get per-handshake private data
> + * @req: handshake arguments
> + *
> + */
> +void *handshake_req_private(struct handshake_req *req)
> +{
> + return (void *)&req->hr_priv;
> +}
> +EXPORT_SYMBOL(handshake_req_private);
> +
> +static bool __add_pending_locked(struct handshake_net *hn,
> + struct handshake_req *req)
> +{
> + if (WARN_ON_ONCE(!list_empty(&req->hr_list)))
> + return false;
> + hn->hn_pending++;
> + list_add_tail(&req->hr_list, &hn->hn_requests);
> + return true;
> +}
> +
> +static void __remove_pending_locked(struct handshake_net *hn,
> +    struct handshake_req *req)
> +{
> + hn->hn_pending--;
> + list_del_init(&req->hr_list);
> +}
> +
> +/*
> + * Returns %true if the request was found on @net's pending list,
> + * otherwise %false.
> + *
> + * If @req was on a pending list, it has not yet been accepted.
> + */
> +static bool remove_pending(struct handshake_net *hn, struct handshake_req *req)
> +{
> + bool ret = false;
> +
> + spin_lock(&hn->hn_lock);
> + if (!list_empty(&req->hr_list)) {
> + __remove_pending_locked(hn, req);
> + ret = true;
> + }
> + spin_unlock(&hn->hn_lock);
> +
> + return ret;
> +}
> +
> +struct handshake_req *handshake_req_next(struct handshake_net *hn, int class)
> +{
> + struct handshake_req *req, *pos;
> +
> + req = NULL;
> + spin_lock(&hn->hn_lock);
> + list_for_each_entry(pos, &hn->hn_requests, hr_list) {
> + if (pos->hr_proto->hp_handler_class != class)
> + continue;
> + __remove_pending_locked(hn, pos);
> + req = pos;
> + break;
> + }
> + spin_unlock(&hn->hn_lock);
> +
> + return req;
> +}
> +
> +/**
> + * handshake_req_submit - Submit a handshake request
> + * @sock: open socket on which to perform the handshake
> + * @req: handshake arguments
> + * @flags: memory allocation flags
> + *
> + * Return values:
> + *   %0: Request queued
> + *   %-EINVAL: Invalid argument
> + *   %-EBUSY: A handshake is already under way for this socket
> + *   %-ESRCH: No handshake agent is available
> + *   %-EAGAIN: Too many pending handshake requests
> + *   %-ENOMEM: Failed to allocate memory
> + *   %-EMSGSIZE: Failed to construct notification message
> + *   %-EOPNOTSUPP: Handshake module not initialized
> + *
> + * A zero return value from handshake_req_submit() means that
> + * exactly one subsequent completion callback is guaranteed.
> + *
> + * A negative return value from handshake_req_submit() means that
> + * no completion callback will be done and that @req has been
> + * destroyed.
> + */
> +int handshake_req_submit(struct socket *sock, struct handshake_req *req,
> + gfp_t flags)
> +{
> + struct handshake_net *hn;
> + struct net *net;
> + int ret;
> +
> + if (!sock || !req || !sock->file) {
> + kfree(req);
> + return -EINVAL;
> + }
> +
> + req->hr_sk = sock->sk;
> + if (!req->hr_sk) {
> + kfree(req);
> + return -EINVAL;
> + }
> + req->hr_odestruct = req->hr_sk->sk_destruct;
> + req->hr_sk->sk_destruct = handshake_sk_destruct;
> +
> + ret = -EOPNOTSUPP;
> + net = sock_net(req->hr_sk);
> + hn = handshake_pernet(net);
> + if (!hn)
> + goto out_err;
> +
> + ret = -EAGAIN;
> + if (READ_ONCE(hn->hn_pending) >= hn->hn_pending_max)
> + goto out_err;
> +
> + spin_lock(&hn->hn_lock);
> + ret = -EOPNOTSUPP;
> + if (test_bit(HANDSHAKE_F_NET_DRAINING, &hn->hn_flags))
> + goto out_unlock;
> + ret = -EBUSY;
> + if (!handshake_req_hash_add(req))
> + goto out_unlock;
> + if (!__add_pending_locked(hn, req))
> + goto out_unlock;
> + spin_unlock(&hn->hn_lock);
> +
> + ret = handshake_genl_notify(net, req->hr_proto->hp_handler_class,
> +    flags);
> + if (ret) {
> + trace_handshake_notify_err(net, req, req->hr_sk, ret);
> + if (remove_pending(hn, req))
> + goto out_err;
> + }
> +
> + /* Prevent socket release while a handshake request is pending */
> + sock_hold(req->hr_sk);
> +
> + trace_handshake_submit(net, req, req->hr_sk);
> + return 0;
> +
> +out_unlock:
> + spin_unlock(&hn->hn_lock);
> +out_err:
> + trace_handshake_submit_err(net, req, req->hr_sk, ret);
> + handshake_req_destroy(req);
> + return ret;
> +}
> +EXPORT_SYMBOL(handshake_req_submit);
> +
> +void handshake_complete(struct handshake_req *req, unsigned int status,
> + struct genl_info *info)
> +{
> + struct sock *sk = req->hr_sk;
> + struct net *net = sock_net(sk);
> +
> + if (!test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
> + trace_handshake_complete(net, req, sk, status);
> + req->hr_proto->hp_done(req, status, info);
> +
> + /* Handshake request is no longer pending */
> + sock_put(sk);
> + }
> +}
> +
> +/**
> + * handshake_req_cancel - Cancel an in-progress handshake
> + * @sock: socket on which there is an ongoing handshake
> + *
> + * Request cancellation races with request completion. To determine
> + * who won, callers examine the return value from this function.
> + *
> + * Return values:
> + *   %true - Uncompleted handshake request was canceled or not found
> + *   %false - Handshake request already completed
> + */
> +bool handshake_req_cancel(struct socket *sock)
> +{
> + struct handshake_req *req;
> + struct handshake_net *hn;
> + struct sock *sk;
> + struct net *net;
> +
> + sk = sock->sk;
> + net = sock_net(sk);

We're still seeing NULL pointer dereferences here.
Typically this happens after the remote closes the
connection early.

I guess I cannot rely on sock_hold(sk); from preventing
someone from doing a "sock->sk = NULL;"

Would it make more sense for req_submit and req_cancel to
operate on "struct sock *" rather than "struct socket *" ?


> + req = handshake_req_hash_lookup(sk);
> + if (!req) {
> + trace_handshake_cancel_none(net, req, sk);
> + return true;
> + }
> +
> + hn = handshake_pernet(net);
> + if (hn && remove_pending(hn, req)) {
> + /* Request hadn't been accepted */
> + trace_handshake_cancel(net, req, sk);
> + sock_put(sk);
> + return true;
> + }
> + if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
> + /* Request already completed */
> + trace_handshake_cancel_busy(net, req, sk);
> + return false;
> + }
> +
> + trace_handshake_cancel(net, req, sk);
> +
> + /* Handshake request is no longer pending */
> + sock_put(sk);
> +
> + return true;
> +}
> +EXPORT_SYMBOL(handshake_req_cancel);


--
Chuck Lever



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v8 1/4] net/handshake: Create a NETLINK service for handling handshake requests
  2023-04-04 15:36   ` Chuck Lever III
@ 2023-04-04 15:44     ` Hannes Reinecke
  2023-04-05  0:00       ` Jakub Kicinski
  0 siblings, 1 reply; 16+ messages in thread
From: Hannes Reinecke @ 2023-04-04 15:44 UTC (permalink / raw)
  To: Chuck Lever III, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Boris Pismenny, open list:NETWORKING [GENERAL],
	kernel-tls-handshake, John Haxby

On 4/4/23 17:36, Chuck Lever III wrote:
> 
> 
>> On Apr 3, 2023, at 2:46 PM, Chuck Lever <cel@kernel.org> wrote:
>>
[ .. ]
>> +/**
>> + * handshake_req_cancel - Cancel an in-progress handshake
>> + * @sock: socket on which there is an ongoing handshake
>> + *
>> + * Request cancellation races with request completion. To determine
>> + * who won, callers examine the return value from this function.
>> + *
>> + * Return values:
>> + *   %true - Uncompleted handshake request was canceled or not found
>> + *   %false - Handshake request already completed
>> + */
>> +bool handshake_req_cancel(struct socket *sock)
>> +{
>> + struct handshake_req *req;
>> + struct handshake_net *hn;
>> + struct sock *sk;
>> + struct net *net;
>> +
>> + sk = sock->sk;
>> + net = sock_net(sk);
> 
> We're still seeing NULL pointer dereferences here.
> Typically this happens after the remote closes the
> connection early.
> 
> I guess I cannot rely on sock_hold(sk); from preventing
> someone from doing a "sock->sk = NULL;"
> 
> Would it make more sense for req_submit and req_cancel to
> operate on "struct sock *" rather than "struct socket *" ?
> 
Stumbled across that one, too; that's why my initial submission
was sprinkled with 'if (!sock->sk)' statements.
So I think it's a good idea.

But waiting for Jakub to enlighten us.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v8 1/4] net/handshake: Create a NETLINK service for handling handshake requests
  2023-04-04 15:44     ` Hannes Reinecke
@ 2023-04-05  0:00       ` Jakub Kicinski
  2023-04-05  6:32         ` Hannes Reinecke
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2023-04-05  0:00 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Chuck Lever III, Paolo Abeni, Eric Dumazet, Boris Pismenny,
	open list:NETWORKING [GENERAL],
	kernel-tls-handshake, John Haxby

On Tue, 4 Apr 2023 17:44:19 +0200 Hannes Reinecke wrote:
> > We're still seeing NULL pointer dereferences here.
> > Typically this happens after the remote closes the
> > connection early.
> > 
> > I guess I cannot rely on sock_hold(sk); from preventing
> > someone from doing a "sock->sk = NULL;"
> > 
> > Would it make more sense for req_submit and req_cancel to
> > operate on "struct sock *" rather than "struct socket *" ?
> >   
> Stumbled across that one, too; that's why my initial submission
> was sprinkled with 'if (!sock->sk)' statements.
> So I think it's a good idea.
> 
> But waiting for Jakub to enlighten us.

Ah, I'm probably the weakest of the netdev maintainers when it comes 
to the socket layer :)

I thought sock->sk is only cleared if the "user" of the socket closes
it. But yes, both sock->sk == NULL and sk->sk_socket == NULL are
entirely possible, and the networking stack usually operates on 
struct sock. Why exactly those two are separate beings is one of 
the mysteries of Linux networking which causes guaranteed confusion 
to all the newcomers. I wish I knew the details so I could at least
document it :S

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v8 1/4] net/handshake: Create a NETLINK service for handling handshake requests
  2023-04-03 18:46 ` [PATCH v8 1/4] net/handshake: Create a NETLINK service for handling handshake requests Chuck Lever
  2023-04-04 15:36   ` Chuck Lever III
@ 2023-04-05  0:23   ` Jakub Kicinski
  2023-04-05  0:44     ` Chuck Lever III
  1 sibling, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2023-04-05  0:23 UTC (permalink / raw)
  To: Chuck Lever
  Cc: pabeni, edumazet, borisp, netdev, kernel-tls-handshake, john.haxby

On Mon, 03 Apr 2023 14:46:02 -0400 Chuck Lever wrote:
> +int handshake_nl_done_doit(struct sk_buff *skb, struct genl_info *info)
> +{
> +	struct net *net = sock_net(skb->sk);
> +	struct socket *sock = NULL;
> +	struct handshake_req *req;
> +	int fd, status, err;
> +
> +	if (GENL_REQ_ATTR_CHECK(info, HANDSHAKE_A_DONE_SOCKFD))
> +		return -EINVAL;
> +	fd = nla_get_u32(info->attrs[HANDSHAKE_A_DONE_SOCKFD]);
> +
> +	err = 0;
> +	sock = sockfd_lookup(fd, &err);
> +	if (err) {
> +		err = -EBADF;
> +		goto out_status;
> +	}
> +
> +	req = handshake_req_hash_lookup(sock->sk);
> +	if (!req) {

fput() missing on this path?

> +		err = -EBUSY;
> +		goto out_status;
> +	}
> +
> +	trace_handshake_cmd_done(net, req, sock->sk, fd);
> +
> +	status = -EIO;
> +	if (info->attrs[HANDSHAKE_A_DONE_STATUS])
> +		status = nla_get_u32(info->attrs[HANDSHAKE_A_DONE_STATUS]);
> +
> +	handshake_complete(req, status, info);
> +	fput(sock->file);
> +	return 0;
> +
> +out_status:
> +	trace_handshake_cmd_done_err(net, req, sock->sk, err);
> +	return err;
> +}

> +	/*
> +	 * Arbitrary limit to prevent handshakes that do not make
> +	 * progress from clogging up the system.
> +	 */
> +	si_meminfo(&si);
> +	tmp = si.totalram / (25 * si.mem_unit);
> +	hn->hn_pending_max = clamp(tmp, 3UL, 25UL);

No idea what this does (what's mem_unit?), we'll have to trust you :)


And there are some kdoc issues here:

include/trace/events/handshake.h:112: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 ** Request lifetime events
include/trace/events/handshake.h:149: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v8 3/4] net/handshake: Add Kunit tests for the handshake consumer API
  2023-04-03 18:46 ` [PATCH v8 3/4] net/handshake: Add Kunit tests for the handshake consumer API Chuck Lever
@ 2023-04-05  0:24   ` Jakub Kicinski
  2023-04-05  0:45     ` Chuck Lever III
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2023-04-05  0:24 UTC (permalink / raw)
  To: Chuck Lever
  Cc: pabeni, edumazet, borisp, netdev, kernel-tls-handshake, john.haxby

On Mon, 03 Apr 2023 14:46:15 -0400 Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> These verify the API contracts and help exercise lifetime rules for
> consumer sockets and handshake_req structures.

Does it build with allmodconfig for you?

error: the following would cause module name conflict:
  lib/kunit/kunit-test.ko
  net/handshake/kunit-test.ko

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v8 1/4] net/handshake: Create a NETLINK service for handling handshake requests
  2023-04-05  0:23   ` Jakub Kicinski
@ 2023-04-05  0:44     ` Chuck Lever III
       [not found]       ` <20230404183233.78895cf2@kernel.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Chuck Lever III @ 2023-04-05  0:44 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Chuck Lever, Paolo Abeni, Eric Dumazet, Boris Pismenny,
	open list:NETWORKING [GENERAL],
	kernel-tls-handshake, John Haxby


> On Apr 4, 2023, at 8:23 PM, Jakub Kicinski <kuba@kernel.org> wrote:
> 
> On Mon, 03 Apr 2023 14:46:02 -0400 Chuck Lever wrote:
>> +int handshake_nl_done_doit(struct sk_buff *skb, struct genl_info *info)
>> +{
>> + struct net *net = sock_net(skb->sk);
>> + struct socket *sock = NULL;
>> + struct handshake_req *req;
>> + int fd, status, err;
>> +
>> + if (GENL_REQ_ATTR_CHECK(info, HANDSHAKE_A_DONE_SOCKFD))
>> + return -EINVAL;
>> + fd = nla_get_u32(info->attrs[HANDSHAKE_A_DONE_SOCKFD]);
>> +
>> + err = 0;
>> + sock = sockfd_lookup(fd, &err);
>> + if (err) {
>> + err = -EBADF;
>> + goto out_status;
>> + }
>> +
>> + req = handshake_req_hash_lookup(sock->sk);
>> + if (!req) {
> 
> fput() missing on this path?

fput of ... sock->file? DONE shouldn't do that if
it can't find the sock's matching handshake_req.


>> + err = -EBUSY;
>> + goto out_status;
>> + }
>> +
>> + trace_handshake_cmd_done(net, req, sock->sk, fd);
>> +
>> + status = -EIO;
>> + if (info->attrs[HANDSHAKE_A_DONE_STATUS])
>> + status = nla_get_u32(info->attrs[HANDSHAKE_A_DONE_STATUS]);
>> +
>> + handshake_complete(req, status, info);
>> + fput(sock->file);
>> + return 0;
>> +
>> +out_status:
>> + trace_handshake_cmd_done_err(net, req, sock->sk, err);
>> + return err;
>> +}
> 
>> + /*
>> + * Arbitrary limit to prevent handshakes that do not make
>> + * progress from clogging up the system.
>> + */
>> + si_meminfo(&si);
>> + tmp = si.totalram / (25 * si.mem_unit);
>> + hn->hn_pending_max = clamp(tmp, 3UL, 25UL);
> 
> No idea what this does (what's mem_unit?), we'll have to trust you :)

Paolo requested that we link the pending_max limit to
the memory size of the system. I thought folks would
be familiar with the si_meminfo() kernel API.

What does it need? A better comment? A different approach?


> And there are some kdoc issues here:
> 
> include/trace/events/handshake.h:112: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
> ** Request lifetime events
> include/trace/events/handshake.h:149: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst

Ah, that comment form was copied from
include/trace/events/rpcrdma.h. I didn't realize that would
be a source of noise. I'll replace it.

--
Chuck Lever



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v8 3/4] net/handshake: Add Kunit tests for the handshake consumer API
  2023-04-05  0:24   ` Jakub Kicinski
@ 2023-04-05  0:45     ` Chuck Lever III
  0 siblings, 0 replies; 16+ messages in thread
From: Chuck Lever III @ 2023-04-05  0:45 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Chuck Lever, pabeni, edumazet, borisp, netdev,
	kernel-tls-handshake, John Haxby



> On Apr 4, 2023, at 8:24 PM, Jakub Kicinski <kuba@kernel.org> wrote:
> 
> On Mon, 03 Apr 2023 14:46:15 -0400 Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>> 
>> These verify the API contracts and help exercise lifetime rules for
>> consumer sockets and handshake_req structures.
> 
> Does it build with allmodconfig for you?
> 
> error: the following would cause module name conflict:
>  lib/kunit/kunit-test.ko
>  net/handshake/kunit-test.ko

It doesn't. This was pointed out to me earlier, along with
another related build issue. I've fixed these in my private
tree for v9.


--
Chuck Lever



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v8 1/4] net/handshake: Create a NETLINK service for handling handshake requests
  2023-04-05  0:00       ` Jakub Kicinski
@ 2023-04-05  6:32         ` Hannes Reinecke
  2023-04-05 14:10           ` Chuck Lever III
  0 siblings, 1 reply; 16+ messages in thread
From: Hannes Reinecke @ 2023-04-05  6:32 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Chuck Lever III, Paolo Abeni, Eric Dumazet, Boris Pismenny,
	open list:NETWORKING [GENERAL],
	kernel-tls-handshake, John Haxby

On 4/5/23 02:00, Jakub Kicinski wrote:
> On Tue, 4 Apr 2023 17:44:19 +0200 Hannes Reinecke wrote:
>>> We're still seeing NULL pointer dereferences here.
>>> Typically this happens after the remote closes the
>>> connection early.
>>>
>>> I guess I cannot rely on sock_hold(sk); from preventing
>>> someone from doing a "sock->sk = NULL;"
>>>
>>> Would it make more sense for req_submit and req_cancel to
>>> operate on "struct sock *" rather than "struct socket *" ?
>>>    
>> Stumbled across that one, too; that's why my initial submission
>> was sprinkled with 'if (!sock->sk)' statements.
>> So I think it's a good idea.
>>
>> But waiting for Jakub to enlighten us.
> 
> Ah, I'm probably the weakest of the netdev maintainers when it comes
> to the socket layer :)
> 
> I thought sock->sk is only cleared if the "user" of the socket closes
> it. But yes, both sock->sk == NULL and sk->sk_socket == NULL are
> entirely possible, and the networking stack usually operates on
> struct sock. Why exactly those two are separate beings is one of
> the mysteries of Linux networking which causes guaranteed confusion
> to all the newcomers. I wish I knew the details so I could at least
> document it :S

Bummer. I had high hopes on you being able to shed some light on this.

So, Chuck: maybe we should be looking at switching over to 'struct sock' 
for the internal stuff. If we don't have to do a 'fput()' somewhere we 
should be good...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v8 1/4] net/handshake: Create a NETLINK service for handling handshake requests
  2023-04-05  6:32         ` Hannes Reinecke
@ 2023-04-05 14:10           ` Chuck Lever III
  0 siblings, 0 replies; 16+ messages in thread
From: Chuck Lever III @ 2023-04-05 14:10 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Jakub Kicinski, Paolo Abeni, Eric Dumazet, Boris Pismenny,
	open list:NETWORKING [GENERAL],
	kernel-tls-handshake, John Haxby



> On Apr 5, 2023, at 2:32 AM, Hannes Reinecke <hare@suse.de> wrote:
> 
> On 4/5/23 02:00, Jakub Kicinski wrote:
>> On Tue, 4 Apr 2023 17:44:19 +0200 Hannes Reinecke wrote:
>>>> We're still seeing NULL pointer dereferences here.
>>>> Typically this happens after the remote closes the
>>>> connection early.
>>>> 
>>>> I guess I cannot rely on sock_hold(sk); from preventing
>>>> someone from doing a "sock->sk = NULL;"
>>>> 
>>>> Would it make more sense for req_submit and req_cancel to
>>>> operate on "struct sock *" rather than "struct socket *" ?
>>>>   
>>> Stumbled across that one, too; that's why my initial submission
>>> was sprinkled with 'if (!sock->sk)' statements.
>>> So I think it's a good idea.
>>> 
>>> But waiting for Jakub to enlighten us.
>> Ah, I'm probably the weakest of the netdev maintainers when it comes
>> to the socket layer :)
>> I thought sock->sk is only cleared if the "user" of the socket closes
>> it. But yes, both sock->sk == NULL and sk->sk_socket == NULL are
>> entirely possible, and the networking stack usually operates on
>> struct sock. Why exactly those two are separate beings is one of
>> the mysteries of Linux networking which causes guaranteed confusion
>> to all the newcomers. I wish I knew the details so I could at least
>> document it :S
> 
> Bummer. I had high hopes on you being able to shed some light on this.
> 
> So, Chuck: maybe we should be looking at switching over to 'struct sock' for the internal stuff. If we don't have to do a 'fput()' somewhere we should be good...

I've made handshake_req_cancel() take a "struct sock *" as
a starting point. I'll send out something for you to try
later today.


--
Chuck Lever



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v8 1/4] net/handshake: Create a NETLINK service for handling handshake requests
       [not found]       ` <20230404183233.78895cf2@kernel.org>
@ 2023-04-05 14:25         ` Chuck Lever III
  0 siblings, 0 replies; 16+ messages in thread
From: Chuck Lever III @ 2023-04-05 14:25 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Chuck Lever, Paolo Abeni, Eric Dumazet, Boris Pismenny,
	open list:NETWORKING [GENERAL],
	kernel-tls-handshake@lists.linux.dev"
	<kernel-tls-handshake@lists.linux.dev>,
	John Haxby <john.haxby@oracle.com>



> On Apr 4, 2023, at 9:32 PM, Jakub Kicinski <kuba@kernel.org> wrote:
> 
> On Wed, 5 Apr 2023 00:44:11 +0000 Chuck Lever III wrote:
>>> On Mon, 03 Apr 2023 14:46:02 -0400 Chuck Lever wrote:  
>>>> +int handshake_nl_done_doit(struct sk_buff *skb, struct genl_info *info)
>>>> +{
>>>> + struct net *net = sock_net(skb->sk);
>>>> + struct socket *sock = NULL;
>>>> + struct handshake_req *req;
>>>> + int fd, status, err;
>>>> +
>>>> + if (GENL_REQ_ATTR_CHECK(info, HANDSHAKE_A_DONE_SOCKFD))
>>>> + return -EINVAL;
>>>> + fd = nla_get_u32(info->attrs[HANDSHAKE_A_DONE_SOCKFD]);
>>>> +
>>>> + err = 0;
>>>> + sock = sockfd_lookup(fd, &err);
>>>> + if (err) {
>>>> + err = -EBADF;
>>>> + goto out_status;
>>>> + }
>>>> +
>>>> + req = handshake_req_hash_lookup(sock->sk);
>>>> + if (!req) {  
>>> 
>>> fput() missing on this path?  
>> 
>> fput of ... sock->file? DONE shouldn't do that if
>> it can't find the sock's matching handshake_req.
> 
> Hm, sounds odd. sockfd_lookup() returns a sock with incremented
> reference count, so if user space passes a random fd unrelated 
> to any real request - we'll bump the refcount of that fd and
> return  an error. So the related file is going to get leaked.
> What did I miss? 🤔️

Makes sense, got it. I'll add something here.


>>>> + err = -EBUSY;
>>>> + goto out_status;
>>>> + }
>>>> +
>>>> + trace_handshake_cmd_done(net, req, sock->sk, fd);
>>>> +
>>>> + status = -EIO;
>>>> + if (info->attrs[HANDSHAKE_A_DONE_STATUS])
>>>> + status = nla_get_u32(info->attrs[HANDSHAKE_A_DONE_STATUS]);
>>>> +
>>>> + handshake_complete(req, status, info);
>>>> + fput(sock->file);
>>>> + return 0;
>>>> +
>>>> +out_status:
>>>> + trace_handshake_cmd_done_err(net, req, sock->sk, err);
>>>> + return err;
>>>> +}  
>>> 
>>>> + /*
>>>> + * Arbitrary limit to prevent handshakes that do not make
>>>> + * progress from clogging up the system.
>>>> + */
>>>> + si_meminfo(&si);
>>>> + tmp = si.totalram / (25 * si.mem_unit);
>>>> + hn->hn_pending_max = clamp(tmp, 3UL, 25UL);  
>>> 
>>> No idea what this does (what's mem_unit?), we'll have to trust you :)  
>> 
>> Paolo requested that we link the pending_max limit to
>> the memory size of the system. I thought folks would
>> be familiar with the si_meminfo() kernel API.
>> 
>> What does it need? A better comment? A different approach?
> 
> I think comment would be good. Are you clamping to 1/25th of all
> available pages? The request take up about a page?

As the sign sez, it's an arbitrary cap. I just picked a
formula that gave about 10 pending entries on my tiny
test system.

As always, suggestions are welcome.

--
Chuck Lever



^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-04-05 14:26 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-03 18:45 [PATCH v8 0/4] Another crack at a handshake upcall mechanism Chuck Lever
2023-04-03 18:46 ` [PATCH v8 1/4] net/handshake: Create a NETLINK service for handling handshake requests Chuck Lever
2023-04-04 15:36   ` Chuck Lever III
2023-04-04 15:44     ` Hannes Reinecke
2023-04-05  0:00       ` Jakub Kicinski
2023-04-05  6:32         ` Hannes Reinecke
2023-04-05 14:10           ` Chuck Lever III
2023-04-05  0:23   ` Jakub Kicinski
2023-04-05  0:44     ` Chuck Lever III
     [not found]       ` <20230404183233.78895cf2@kernel.org>
2023-04-05 14:25         ` Chuck Lever III
2023-04-03 18:46 ` [PATCH v8 2/4] net/handshake: Add a kernel API for requesting a TLSv1.3 handshake Chuck Lever
2023-04-03 18:46 ` [PATCH v8 3/4] net/handshake: Add Kunit tests for the handshake consumer API Chuck Lever
2023-04-05  0:24   ` Jakub Kicinski
2023-04-05  0:45     ` Chuck Lever III
2023-04-03 18:46 ` [PATCH v8 4/4] SUNRPC: Recognize control messages in server-side TCP socket code Chuck Lever
2023-04-04 14:41   ` Chuck Lever

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.