All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chuck Lever III <chuck.lever@oracle.com>
To: Jeff Layton <jlayton@redhat.com>
Cc: Chuck Lever <cel@kernel.org>, Jakub Kicinski <kuba@kernel.org>,
	Paolo Abeni <pabeni@redhat.com>,
	Eric Dumazet <edumazet@google.com>,
	"open list:NETWORKING [GENERAL]" <netdev@vger.kernel.org>,
	"kernel-tls-handshake@lists.linux.dev"
	<kernel-tls-handshake@lists.linux.dev>,
	John Haxby <john.haxby@oracle.com>
Subject: Re: [PATCH v7 1/2] net/handshake: Create a NETLINK service for handling handshake requests
Date: Tue, 28 Mar 2023 18:19:22 +0000	[thread overview]
Message-ID: <8154E3F3-C8A2-4BCF-8DFC-E00EA7B9CF80@oracle.com> (raw)
In-Reply-To: <3e4e33c19a9c608be863d2d7207f5a9cb7db795f.camel@redhat.com>



> On Mar 28, 2023, at 2:14 PM, Jeff Layton <jlayton@redhat.com> wrote:
> 
> On Sat, 2023-03-18 at 12:18 -0400, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>> 
>> When a kernel consumer needs a transport layer security session, it
>> first needs a handshake to negotiate and establish a session. This
>> negotiation can be done in user space via one of the several
>> existing library implementations, or it can be done in the kernel.
>> 
>> No in-kernel handshake implementations yet exist. In their absence,
>> we add a netlink service that can:
>> 
>> a. Notify a user space daemon that a handshake is needed.
>> 
>> b. Once notified, the daemon calls the kernel back via this
>>   netlink service to get the handshake parameters, including an
>>   open socket on which to establish the session.
>> 
>> c. Once the handshake is complete, the daemon reports the
>>   session status and other information via a second netlink
>>   operation. This operation marks that it is safe for the
>>   kernel to use the open socket and the security session
>>   established there.
>> 
>> The notification service uses a multicast group. Each handshake
>> mechanism (eg, tlshd) adopts its own group number so that the
>> handshake services are completely independent of one another. The
>> kernel can then tell via netlink_has_listeners() whether a handshake
>> service is active and prepared to handle a handshake request.
>> 
>> A new netlink operation, ACCEPT, acts like accept(2) in that it
>> instantiates a file descriptor in the user space daemon's fd table.
>> If this operation is successful, the reply carries the fd number,
>> which can be treated as an open and ready file descriptor.
>> 
>> While user space is performing the handshake, the kernel keeps its
>> muddy paws off the open socket. A second new netlink operation,
>> DONE, indicates that the user space daemon is finished with the
>> socket and it is safe for the kernel to use again. The operation
>> also indicates whether a session was established successfully.
>> 
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>> Documentation/netlink/specs/handshake.yaml |  122 +++++++++++
>> MAINTAINERS                                |    8 +
>> include/trace/events/handshake.h           |  159 ++++++++++++++
>> include/uapi/linux/handshake.h             |   70 ++++++
>> net/Kconfig                                |    5 
>> net/Makefile                               |    1 
>> net/handshake/Makefile                     |   11 +
>> net/handshake/genl.c                       |   57 +++++
>> net/handshake/genl.h                       |   23 ++
>> net/handshake/handshake.h                  |   82 +++++++
>> net/handshake/netlink.c                    |  316 ++++++++++++++++++++++++++++
>> net/handshake/request.c                    |  307 +++++++++++++++++++++++++++
>> net/handshake/trace.c                      |   20 ++
>> 13 files changed, 1181 insertions(+)
>> create mode 100644 Documentation/netlink/specs/handshake.yaml
>> create mode 100644 include/trace/events/handshake.h
>> create mode 100644 include/uapi/linux/handshake.h
>> create mode 100644 net/handshake/Makefile
>> create mode 100644 net/handshake/genl.c
>> create mode 100644 net/handshake/genl.h
>> create mode 100644 net/handshake/handshake.h
>> create mode 100644 net/handshake/netlink.c
>> create mode 100644 net/handshake/request.c
>> create mode 100644 net/handshake/trace.c
>> 
>> 
> 
> [...]
> 
>> diff --git a/net/handshake/request.c b/net/handshake/request.c
>> new file mode 100644
>> index 000000000000..3f8ae9e990d2
>> --- /dev/null
>> +++ b/net/handshake/request.c
>> @@ -0,0 +1,307 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Handshake request lifetime events
>> + *
>> + * Author: Chuck Lever <chuck.lever@oracle.com>
>> + *
>> + * Copyright (c) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#include <linux/types.h>
>> +#include <linux/socket.h>
>> +#include <linux/kernel.h>
>> +#include <linux/module.h>
>> +#include <linux/skbuff.h>
>> +#include <linux/inet.h>
>> +#include <linux/fdtable.h>
>> +#include <linux/rhashtable.h>
>> +
>> +#include <net/sock.h>
>> +#include <net/genetlink.h>
>> +#include <net/netns/generic.h>
>> +
>> +#include <uapi/linux/handshake.h>
>> +#include "handshake.h"
>> +
>> +#include <trace/events/handshake.h>
>> +
>> +/*
>> + * We need both a handshake_req -> sock mapping, and a sock ->
>> + * handshake_req mapping. Both are one-to-one.
>> + *
>> + * To avoid adding another pointer field to struct sock, net/handshake
>> + * maintains a hash table, indexed by the memory address of @sock, to
>> + * find the struct handshake_req outstanding for that socket. The
>> + * reverse direction uses a simple pointer field in the handshake_req
>> + * struct.
>> + */
>> +
>> +static struct rhashtable handshake_rhashtbl ____cacheline_aligned_in_smp;
>> +
>> +static const struct rhashtable_params handshake_rhash_params = {
>> +	.key_len		= sizeof_field(struct handshake_req, hr_sk),
>> +	.key_offset		= offsetof(struct handshake_req, hr_sk),
>> +	.head_offset		= offsetof(struct handshake_req, hr_rhash),
>> +	.automatic_shrinking	= true,
>> +};
>> +
>> +int handshake_req_hash_init(void)
>> +{
>> +	return rhashtable_init(&handshake_rhashtbl, &handshake_rhash_params);
>> +}
>> +
>> +void handshake_req_hash_destroy(void)
>> +{
>> +	rhashtable_destroy(&handshake_rhashtbl);
>> +}
>> +
>> +struct handshake_req *handshake_req_hash_lookup(struct sock *sk)
>> +{
>> +	return rhashtable_lookup_fast(&handshake_rhashtbl, &sk,
> 
> Is this correct? It seems like we should be searching for the struct
> sock pointer value, not on the pointer to the pointer (which will be a
> stack var), right?

I copied this from the nfsd_file and nfs4_file code we added recently.
rhashtable_lookup_fast takes a pointer to the key, so a pointer to a
pointer should be correct in this case.


>> +				      handshake_rhash_params);
>> +}
>> +
>> +static noinline bool handshake_req_hash_add(struct handshake_req *req)
>> +{
>> +	int ret;
>> +
>> +	ret = rhashtable_lookup_insert_fast(&handshake_rhashtbl,
>> +					    &req->hr_rhash,
>> +					    handshake_rhash_params);
>> +	return ret == 0;
>> +}
>> +
>> +static noinline void handshake_req_destroy(struct handshake_req *req)
>> +{
>> +	if (req->hr_proto->hp_destroy)
>> +		req->hr_proto->hp_destroy(req);
>> +	rhashtable_remove_fast(&handshake_rhashtbl, &req->hr_rhash,
>> +			       handshake_rhash_params);
>> +	kfree(req);
>> +}
>> +
>> +static void handshake_sk_destruct(struct sock *sk)
>> +{
>> +	void (*sk_destruct)(struct sock *sk);
>> +	struct handshake_req *req;
>> +
>> +	req = handshake_req_hash_lookup(sk);
>> +	if (!req)
>> +		return;
>> +
>> +	trace_handshake_destruct(sock_net(sk), req, sk);
>> +	sk_destruct = req->hr_odestruct;
>> +	handshake_req_destroy(req);
>> +	if (sk_destruct)
>> +		sk_destruct(sk);
>> +}
>> +
>> +/**
>> + * handshake_req_alloc - consumer API to allocate a request
>> + * @sock: open socket on which to perform a handshake
>> + * @proto: security protocol
>> + * @flags: memory allocation flags
>> + *
>> + * Returns an initialized handshake_req or NULL.
>> + */
>> +struct handshake_req *handshake_req_alloc(struct socket *sock,
>> +					  const struct handshake_proto *proto,
>> +					  gfp_t flags)
>> +{
>> +	struct sock *sk = sock->sk;
>> +	struct net *net = sock_net(sk);
>> +	struct handshake_net *hn = handshake_pernet(net);
>> +	struct handshake_req *req;
>> +
>> +	if (!hn)
>> +		return NULL;
>> +
>> +	req = kzalloc(struct_size(req, hr_priv, proto->hp_privsize), flags);
>> +	if (!req)
>> +		return NULL;
>> +
>> +	sock_hold(sk);
>> +
>> +	INIT_LIST_HEAD(&req->hr_list);
>> +	req->hr_sk = sk;
>> +	req->hr_proto = proto;
>> +	return req;
>> +}
>> +EXPORT_SYMBOL(handshake_req_alloc);
>> +
>> +/**
>> + * handshake_req_private - consumer API to return per-handshake private data
>> + * @req: handshake arguments
>> + *
>> + */
>> +void *handshake_req_private(struct handshake_req *req)
>> +{
>> +	return (void *)&req->hr_priv;
>> +}
>> +EXPORT_SYMBOL(handshake_req_private);
>> +
>> +static bool __add_pending_locked(struct handshake_net *hn,
>> +				 struct handshake_req *req)
>> +{
>> +	if (!list_empty(&req->hr_list))
>> +		return false;
>> +	hn->hn_pending++;
>> +	list_add_tail(&req->hr_list, &hn->hn_requests);
>> +	return true;
>> +}
>> +
>> +void __remove_pending_locked(struct handshake_net *hn,
>> +			     struct handshake_req *req)
>> +{
>> +	hn->hn_pending--;
>> +	list_del_init(&req->hr_list);
>> +}
>> +
>> +/*
>> + * Returns %true if the request was found on @net's pending list,
>> + * otherwise %false.
>> + *
>> + * If @req was on a pending list, it has not yet been accepted.
>> + */
>> +static bool remove_pending(struct handshake_net *hn, struct handshake_req *req)
>> +{
>> +	bool ret;
>> +
>> +	ret = false;
>> +
>> +	spin_lock(&hn->hn_lock);
>> +	if (!list_empty(&req->hr_list)) {
>> +		__remove_pending_locked(hn, req);
>> +		ret = true;
>> +	}
>> +	spin_unlock(&hn->hn_lock);
>> +
>> +	return ret;
>> +}
>> +
>> +/**
>> + * handshake_req_submit - consumer API to submit a handshake request
>> + * @req: handshake arguments
>> + * @flags: memory allocation flags
>> + *
>> + * Return values:
>> + *   %0: Request queued
>> + *   %-EBUSY: A handshake is already under way for this socket
>> + *   %-ESRCH: No handshake agent is available
>> + *   %-EAGAIN: Too many pending handshake requests
>> + *   %-ENOMEM: Failed to allocate memory
>> + *   %-EMSGSIZE: Failed to construct notification message
>> + *   %-EOPNOTSUPP: Handshake module not initialized
>> + *
>> + * A zero return value from handshake_request() means that
>> + * exactly one subsequent completion callback is guaranteed.
>> + *
>> + * A negative return value from handshake_request() means that
>> + * no completion callback will be done and that @req has been
>> + * destroyed.
>> + */
>> +int handshake_req_submit(struct handshake_req *req, gfp_t flags)
>> +{
>> +	struct sock *sk = req->hr_sk;
>> +	struct net *net = sock_net(sk);
>> +	struct handshake_net *hn = handshake_pernet(net);
>> +	int ret;
>> +
>> +	if (!hn)
>> +		return -EOPNOTSUPP;
>> +
>> +	ret = -EAGAIN;
>> +	if (READ_ONCE(hn->hn_pending) >= hn->hn_pending_max)
>> +		goto out_err;
>> +
>> +	req->hr_odestruct = sk->sk_destruct;
>> +	sk->sk_destruct = handshake_sk_destruct;
>> +	spin_lock(&hn->hn_lock);
>> +	ret = -EOPNOTSUPP;
>> +	if (test_bit(HANDSHAKE_F_NET_DRAINING, &hn->hn_flags))
>> +		goto out_unlock;
>> +	ret = -EBUSY;
>> +	if (!handshake_req_hash_add(req))
>> +		goto out_unlock;
>> +	if (!__add_pending_locked(hn, req))
>> +		goto out_unlock;
>> +	spin_unlock(&hn->hn_lock);
>> +
>> +	ret = handshake_genl_notify(net, req->hr_proto->hp_handler_class,
>> +				    flags);
>> +	if (ret) {
>> +		trace_handshake_notify_err(net, req, sk, ret);
>> +		if (remove_pending(hn, req))
>> +			goto out_err;
>> +	}
>> +
>> +	trace_handshake_submit(net, req, sk);
>> +	return 0;
>> +
>> +out_unlock:
>> +	spin_unlock(&hn->hn_lock);
>> +out_err:
>> +	trace_handshake_submit_err(net, req, sk, ret);
>> +	handshake_req_destroy(req);
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL(handshake_req_submit);
>> +
>> +void handshake_complete(struct handshake_req *req, unsigned int status,
>> +			struct genl_info *info)
>> +{
>> +	struct sock *sk = req->hr_sk;
>> +	struct net *net = sock_net(sk);
>> +
>> +	if (!test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
>> +		trace_handshake_complete(net, req, sk, status);
>> +		req->hr_proto->hp_done(req, status, info);
>> +		__sock_put(sk);
>> +	}
>> +}
>> +
>> +/**
>> + * handshake_req_cancel - consumer API to cancel an in-progress handshake
>> + * @sock: socket on which there is an ongoing handshake
>> + *
>> + * XXX: Perhaps killing the user space agent might also be necessary?
>> + *
>> + * Request cancellation races with request completion. To determine
>> + * who won, callers examine the return value from this function.
>> + *
>> + * Return values:
>> + *   %true - Uncompleted handshake request was canceled or not found
>> + *   %false - Handshake request already completed
>> + */
>> +bool handshake_req_cancel(struct socket *sock)
>> +{
>> +	struct handshake_req *req;
>> +	struct handshake_net *hn;
>> +	struct sock *sk;
>> +	struct net *net;
>> +
>> +	sk = sock->sk;
>> +	net = sock_net(sk);
>> +	req = handshake_req_hash_lookup(sk);
>> +	if (!req) {
>> +		trace_handshake_cancel_none(net, req, sk);
>> +		return true;
>> +	}
>> +
>> +	hn = handshake_pernet(net);
>> +	if (hn && remove_pending(hn, req)) {
>> +		/* Request hadn't been accepted */
>> +		trace_handshake_cancel(net, req, sk);
>> +		return true;
>> +	}
>> +	if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
>> +		/* Request already completed */
>> +		trace_handshake_cancel_busy(net, req, sk);
>> +		return false;
>> +	}
>> +
>> +	__sock_put(sk);
>> +	trace_handshake_cancel(net, req, sk);
>> +	return true;
>> +}
>> +EXPORT_SYMBOL(handshake_req_cancel);
>> diff --git a/net/handshake/trace.c b/net/handshake/trace.c
>> new file mode 100644
>> index 000000000000..1c4d8e27e17a
>> --- /dev/null
>> +++ b/net/handshake/trace.c
>> @@ -0,0 +1,20 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Trace points for transport security layer handshakes.
>> + *
>> + * Author: Chuck Lever <chuck.lever@oracle.com>
>> + *
>> + * Copyright (c) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#include <linux/types.h>
>> +
>> +#include <net/sock.h>
>> +#include <net/netlink.h>
>> +#include <net/genetlink.h>
>> +
>> +#include "handshake.h"
>> +
>> +#define CREATE_TRACE_POINTS
>> +
>> +#include <trace/events/handshake.h>
>> 
>> 
>> 
> 
> -- 
> Jeff Layton <jlayton@redhat.com>

--
Chuck Lever



  reply	other threads:[~2023-03-28 18:19 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-18 16:18 [PATCH v7 0/2] Another crack at a handshake upcall mechanism Chuck Lever
2023-03-18 16:18 ` [PATCH v7 1/2] net/handshake: Create a NETLINK service for handling handshake requests Chuck Lever
2023-03-20  6:49   ` Hannes Reinecke
2023-03-21 11:27   ` Paolo Abeni
2023-03-21 13:58     ` Chuck Lever III
2023-03-22  9:03       ` Paolo Abeni
2023-03-22 13:35         ` Chuck Lever III
2023-03-22 16:32           ` Chuck Lever III
2023-03-21 19:55     ` Fwd: " Chuck Lever III
2023-03-22  9:06   ` Paolo Abeni
2023-03-28 18:14   ` Jeff Layton
2023-03-28 18:19     ` Chuck Lever III [this message]
2023-03-28 18:32       ` Jeff Layton
2023-03-18 16:18 ` [PATCH v7 2/2] net/tls: Add kernel APIs for requesting a TLSv1.3 handshake Chuck Lever
2023-03-20  6:53   ` Hannes Reinecke
2023-03-18 16:26 ` [PATCH v7 0/2] Another crack at a handshake upcall mechanism Chuck Lever III

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8154E3F3-C8A2-4BCF-8DFC-E00EA7B9CF80@oracle.com \
    --to=chuck.lever@oracle.com \
    --cc=cel@kernel.org \
    --cc=edumazet@google.com \
    --cc=jlayton@redhat.com \
    --cc=john.haxby@oracle.com \
    --cc=kernel-tls-handshake@lists.linux.dev \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.