linux-kselftest.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3
@ 2023-02-14 11:17 Sabrina Dubroca
  2023-02-14 11:17 ` [PATCH net-next v2 1/5] tls: remove tls_context argument from tls_set_sw_offload Sabrina Dubroca
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Sabrina Dubroca @ 2023-02-14 11:17 UTC (permalink / raw)
  To: netdev
  Cc: Sabrina Dubroca, Vadim Fedorenko, Frantisek Krenzelok,
	Jakub Kicinski, Kuniyuki Iwashima, Apoorv Kothari,
	Boris Pismenny, John Fastabend, Shuah Khan, linux-kselftest,
	Gal Pressman, Marcel Holtmann

This adds support for receiving KeyUpdate messages (RFC 8446, 4.6.3
[1]). A sender transmits a KeyUpdate message and then changes its TX
key. The receiver should react by updating its RX key before
processing the next message.

This patchset implements key updates by:
 1. pausing decryption when a KeyUpdate message is received, to avoid
    attempting to use the old key to decrypt a record encrypted with
    the new key
 2. returning -EKEYEXPIRED to syscalls that cannot receive the
    KeyUpdate message, until the rekey has been performed by userspace
 3. passing the KeyUpdate message to userspace as a control message
 4. allowing updates of the crypto_info via the TLS_TX/TLS_RX
    setsockopts

This API has been tested with gnutls to make sure that it allows
userspace libraries to implement key updates [2]. Thanks to Frantisek
Krenzelok <fkrenzel@redhat.com> for providing the implementation in
gnutls and testing the kernel patches.

Note: in a future series, I'll clean up tls_set_sw_offload and
eliminate the per-cipher copy-paste using tls_cipher_size_desc.

[1] https://www.rfc-editor.org/rfc/rfc8446#section-4.6.3
[2] https://gitlab.com/gnutls/gnutls/-/merge_requests/1625


Changes in v2
use reverse xmas tree ordering in tls_set_sw_offload and
do_tls_setsockopt_conf
turn the alt_crypto_info into an else if
selftests: add rekey_fail test

Vadim suggested simplifying tls_set_sw_offload by copying the new
crypto_info in the context in do_tls_setsockopt_conf, and then
detecting the rekey in tls_set_sw_offload based on whether the iv was
already set, but I don't think we can have a common error path
(otherwise we'd free the aead etc on rekey failure). I decided instead
to reorganize tls_set_sw_offload so that the context is unmodified
until we know the rekey cannot fail. Some fields will be touched
during the rekey, but they will be set to the same value they had
before the rekey (prot->rec_seq_size, etc).

Apoorv suggested to name the struct tls_crypto_info_keys "tls13"
rather than "tls12". Since we're using the same crypto_info data for
TLS1.3 as for 1.2, even if the tests only run for TLS1.3, I'd rather
keep the "tls12" name, in case we end up adding a
"tls13_crypto_info_aes_gcm_128" type in the future.

Kuniyuki and Apoorv also suggested preventing rekeys on RX when we
haven't received a matching KeyUpdate message, but I'd rather let
userspace handle this and have a symmetric API between TX and RX on
the kernel side. It's a bit of a foot-gun, but we can't really stop a
broken userspace from rolling back the rec_seq on an existing
crypto_info either, and that seems like a worse possible breakage.

Sabrina Dubroca (5):
  tls: remove tls_context argument from tls_set_sw_offload
  tls: block decryption when a rekey is pending
  tls: implement rekey for TLS1.3
  selftests: tls: add key_generation argument to tls_crypto_info_init
  selftests: tls: add rekey tests

 include/net/tls.h                 |   4 +
 net/tls/tls.h                     |   3 +-
 net/tls/tls_device.c              |   2 +-
 net/tls/tls_main.c                |  37 +++-
 net/tls/tls_sw.c                  | 189 +++++++++++++----
 tools/testing/selftests/net/tls.c | 336 +++++++++++++++++++++++++++++-
 6 files changed, 511 insertions(+), 60 deletions(-)

-- 
2.38.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH net-next v2 1/5] tls: remove tls_context argument from tls_set_sw_offload
  2023-02-14 11:17 [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3 Sabrina Dubroca
@ 2023-02-14 11:17 ` Sabrina Dubroca
  2023-02-14 11:17 ` [PATCH net-next v2 2/5] tls: block decryption when a rekey is pending Sabrina Dubroca
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Sabrina Dubroca @ 2023-02-14 11:17 UTC (permalink / raw)
  To: netdev
  Cc: Sabrina Dubroca, Vadim Fedorenko, Frantisek Krenzelok,
	Jakub Kicinski, Kuniyuki Iwashima, Apoorv Kothari,
	Boris Pismenny, John Fastabend, Shuah Khan, linux-kselftest,
	Gal Pressman, Marcel Holtmann

It's not really needed since we end up refetching it as tls_ctx. We
can also remove the NULL check, since we have already dereferenced ctx
in do_tls_setsockopt_conf.

v2: reverse xmas tree

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 net/tls/tls.h        |  2 +-
 net/tls/tls_device.c |  2 +-
 net/tls/tls_main.c   |  4 ++--
 net/tls/tls_sw.c     | 17 +++++++----------
 4 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/net/tls/tls.h b/net/tls/tls.h
index 0e840a0c3437..34d0fe814600 100644
--- a/net/tls/tls.h
+++ b/net/tls/tls.h
@@ -90,7 +90,7 @@ int tls_sk_attach(struct sock *sk, int optname, char __user *optval,
 		  unsigned int optlen);
 void tls_err_abort(struct sock *sk, int err);
 
-int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx);
+int tls_set_sw_offload(struct sock *sk, int tx);
 void tls_update_rx_zc_capable(struct tls_context *tls_ctx);
 void tls_sw_strparser_arm(struct sock *sk, struct tls_context *ctx);
 void tls_sw_strparser_done(struct tls_context *tls_ctx);
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 6c593788dc25..c149f36b42ee 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -1291,7 +1291,7 @@ int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx)
 	context->resync_nh_reset = 1;
 
 	ctx->priv_ctx_rx = context;
-	rc = tls_set_sw_offload(sk, ctx, 0);
+	rc = tls_set_sw_offload(sk, 0);
 	if (rc)
 		goto release_ctx;
 
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 3735cb00905d..fb1da1780f50 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -772,7 +772,7 @@ static int do_tls_setsockopt_conf(struct sock *sk, sockptr_t optval,
 			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSTXDEVICE);
 			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSCURRTXDEVICE);
 		} else {
-			rc = tls_set_sw_offload(sk, ctx, 1);
+			rc = tls_set_sw_offload(sk, 1);
 			if (rc)
 				goto err_crypto_info;
 			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSTXSW);
@@ -786,7 +786,7 @@ static int do_tls_setsockopt_conf(struct sock *sk, sockptr_t optval,
 			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSRXDEVICE);
 			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSCURRRXDEVICE);
 		} else {
-			rc = tls_set_sw_offload(sk, ctx, 0);
+			rc = tls_set_sw_offload(sk, 0);
 			if (rc)
 				goto err_crypto_info;
 			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSRXSW);
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 6d0a534b7baa..238bd18c5eb6 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2465,25 +2465,22 @@ void tls_update_rx_zc_capable(struct tls_context *tls_ctx)
 		tls_ctx->prot_info.version != TLS_1_3_VERSION;
 }
 
-int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx)
+int tls_set_sw_offload(struct sock *sk, int tx)
 {
-	struct tls_context *tls_ctx = tls_get_ctx(sk);
-	struct tls_prot_info *prot = &tls_ctx->prot_info;
-	struct tls_crypto_info *crypto_info;
+	u16 nonce_size, tag_size, iv_size, rec_seq_size, salt_size;
+	char *iv, *rec_seq, *key, *salt, *cipher_name;
 	struct tls_sw_context_tx *sw_ctx_tx = NULL;
 	struct tls_sw_context_rx *sw_ctx_rx = NULL;
+	struct tls_context *ctx = tls_get_ctx(sk);
+	struct tls_crypto_info *crypto_info;
 	struct cipher_context *cctx;
+	struct tls_prot_info *prot;
 	struct crypto_aead **aead;
-	u16 nonce_size, tag_size, iv_size, rec_seq_size, salt_size;
 	struct crypto_tfm *tfm;
-	char *iv, *rec_seq, *key, *salt, *cipher_name;
 	size_t keysize;
 	int rc = 0;
 
-	if (!ctx) {
-		rc = -EINVAL;
-		goto out;
-	}
+	prot = &ctx->prot_info;
 
 	if (tx) {
 		if (!ctx->priv_ctx_tx) {
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH net-next v2 2/5] tls: block decryption when a rekey is pending
  2023-02-14 11:17 [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3 Sabrina Dubroca
  2023-02-14 11:17 ` [PATCH net-next v2 1/5] tls: remove tls_context argument from tls_set_sw_offload Sabrina Dubroca
@ 2023-02-14 11:17 ` Sabrina Dubroca
  2023-02-15  5:09   ` Jakub Kicinski
  2023-02-14 11:17 ` [PATCH net-next v2 3/5] tls: implement rekey for TLS1.3 Sabrina Dubroca
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Sabrina Dubroca @ 2023-02-14 11:17 UTC (permalink / raw)
  To: netdev
  Cc: Sabrina Dubroca, Vadim Fedorenko, Frantisek Krenzelok,
	Jakub Kicinski, Kuniyuki Iwashima, Apoorv Kothari,
	Boris Pismenny, John Fastabend, Shuah Khan, linux-kselftest,
	Gal Pressman, Marcel Holtmann

When a TLS handshake record carrying a KeyUpdate message is received,
all subsequent records will be encrypted with a new key. We need to
stop decrypting incoming records with the old key, and wait until
userspace provides a new key.

Make a note of this in the RX context just after decrypting that
record, and stop recvmsg/splice calls with EKEYEXPIRED until the new
key is available.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 include/net/tls.h |  4 ++++
 net/tls/tls_sw.c  | 44 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)

diff --git a/include/net/tls.h b/include/net/tls.h
index 154949c7b0c8..297732f23804 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -69,8 +69,11 @@ extern const struct tls_cipher_size_desc tls_cipher_size_desc[];
 
 #define TLS_CRYPTO_INFO_READY(info)	((info)->cipher_type)
 
+#define TLS_RECORD_TYPE_HANDSHAKE	0x16
 #define TLS_RECORD_TYPE_DATA		0x17
 
+#define TLS_HANDSHAKE_KEYUPDATE		24	/* rfc8446 B.3: Key update */
+
 #define TLS_AAD_SPACE_SIZE		13
 
 #define MAX_IV_SIZE			16
@@ -145,6 +148,7 @@ struct tls_sw_context_rx {
 
 	struct tls_strparser strp;
 
+	bool key_update_pending;
 	atomic_t decrypt_pending;
 	/* protect crypto_wait with decrypt_pending*/
 	spinlock_t decrypt_compl_lock;
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 238bd18c5eb6..149a39d9a56a 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1687,6 +1687,33 @@ tls_decrypt_device(struct sock *sk, struct msghdr *msg,
 	return 1;
 }
 
+static int tls_check_pending_rekey(struct sock *sk, struct sk_buff *skb)
+{
+	const struct tls_msg *tlm = tls_msg(skb);
+	const struct strp_msg *rxm = strp_msg(skb);
+
+	if (tlm->control == TLS_RECORD_TYPE_HANDSHAKE) {
+		char hs_type;
+		int err;
+
+		if (rxm->full_len < 1)
+			return -EINVAL;
+
+		err = skb_copy_bits(skb, rxm->offset, &hs_type, 1);
+		if (err < 0)
+			return err;
+
+		if (hs_type == TLS_HANDSHAKE_KEYUPDATE) {
+			struct tls_context *ctx = tls_get_ctx(sk);
+			struct tls_sw_context_rx *rx_ctx = ctx->priv_ctx_rx;
+
+			rx_ctx->key_update_pending = true;
+		}
+	}
+
+	return 0;
+}
+
 static int tls_rx_one_record(struct sock *sk, struct msghdr *msg,
 			     struct tls_decrypt_arg *darg)
 {
@@ -1706,6 +1733,10 @@ static int tls_rx_one_record(struct sock *sk, struct msghdr *msg,
 	rxm->full_len -= prot->overhead_size;
 	tls_advance_record_sn(sk, prot, &tls_ctx->rx);
 
+	err = tls_check_pending_rekey(sk, darg->skb);
+	if (err < 0)
+		return err;
+
 	return 0;
 }
 
@@ -1957,6 +1988,12 @@ int tls_sw_recvmsg(struct sock *sk,
 		struct tls_decrypt_arg darg;
 		int to_decrypt, chunk;
 
+		/* a rekey is pending, let userspace deal with it */
+		if (unlikely(ctx->key_update_pending)) {
+			err = -EKEYEXPIRED;
+			break;
+		}
+
 		err = tls_rx_rec_wait(sk, psock, flags & MSG_DONTWAIT,
 				      released);
 		if (err <= 0) {
@@ -2141,6 +2178,12 @@ ssize_t tls_sw_splice_read(struct socket *sock,  loff_t *ppos,
 	if (err < 0)
 		return err;
 
+	/* a rekey is pending, let userspace deal with it */
+	if (unlikely(ctx->key_update_pending)) {
+		err = -EKEYEXPIRED;
+		goto splice_read_end;
+	}
+
 	if (!skb_queue_empty(&ctx->rx_list)) {
 		skb = __skb_dequeue(&ctx->rx_list);
 	} else {
@@ -2526,6 +2569,7 @@ int tls_set_sw_offload(struct sock *sk, int tx)
 		skb_queue_head_init(&sw_ctx_rx->rx_list);
 		skb_queue_head_init(&sw_ctx_rx->async_hold);
 		aead = &sw_ctx_rx->aead_recv;
+		sw_ctx_rx->key_update_pending = false;
 	}
 
 	switch (crypto_info->cipher_type) {
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH net-next v2 3/5] tls: implement rekey for TLS1.3
  2023-02-14 11:17 [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3 Sabrina Dubroca
  2023-02-14 11:17 ` [PATCH net-next v2 1/5] tls: remove tls_context argument from tls_set_sw_offload Sabrina Dubroca
  2023-02-14 11:17 ` [PATCH net-next v2 2/5] tls: block decryption when a rekey is pending Sabrina Dubroca
@ 2023-02-14 11:17 ` Sabrina Dubroca
  2023-02-14 11:17 ` [PATCH net-next v2 4/5] selftests: tls: add key_generation argument to tls_crypto_info_init Sabrina Dubroca
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Sabrina Dubroca @ 2023-02-14 11:17 UTC (permalink / raw)
  To: netdev
  Cc: Sabrina Dubroca, Vadim Fedorenko, Frantisek Krenzelok,
	Jakub Kicinski, Kuniyuki Iwashima, Apoorv Kothari,
	Boris Pismenny, John Fastabend, Shuah Khan, linux-kselftest,
	Gal Pressman, Marcel Holtmann

This adds the possibility to change the key and IV when using
TLS1.3. Changing the cipher or TLS version is not supported.

Once we have updated the RX key, we can unblock the receive side. If
the rekey fails, the context is unmodified and userspace is free to
retry the update or close the socket.

This change only affects tls_sw, since 1.3 offload isn't supported.

v2: reverse xmas tree
    turn the alt_crypto_info into an else if
    don't modify the context when rekey fails

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 net/tls/tls.h        |   3 +-
 net/tls/tls_device.c |   2 +-
 net/tls/tls_main.c   |  37 +++++++++---
 net/tls/tls_sw.c     | 134 +++++++++++++++++++++++++++++++------------
 4 files changed, 129 insertions(+), 47 deletions(-)

diff --git a/net/tls/tls.h b/net/tls/tls.h
index 34d0fe814600..6f9c85eaa9c5 100644
--- a/net/tls/tls.h
+++ b/net/tls/tls.h
@@ -90,7 +90,8 @@ int tls_sk_attach(struct sock *sk, int optname, char __user *optval,
 		  unsigned int optlen);
 void tls_err_abort(struct sock *sk, int err);
 
-int tls_set_sw_offload(struct sock *sk, int tx);
+int tls_set_sw_offload(struct sock *sk, int tx,
+		       struct tls_crypto_info *new_crypto_info);
 void tls_update_rx_zc_capable(struct tls_context *tls_ctx);
 void tls_sw_strparser_arm(struct sock *sk, struct tls_context *ctx);
 void tls_sw_strparser_done(struct tls_context *tls_ctx);
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index c149f36b42ee..1ad50c253dfe 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -1291,7 +1291,7 @@ int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx)
 	context->resync_nh_reset = 1;
 
 	ctx->priv_ctx_rx = context;
-	rc = tls_set_sw_offload(sk, 0);
+	rc = tls_set_sw_offload(sk, 0, NULL);
 	if (rc)
 		goto release_ctx;
 
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index fb1da1780f50..24a4bdb54a53 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -669,9 +669,11 @@ static int tls_getsockopt(struct sock *sk, int level, int optname,
 static int do_tls_setsockopt_conf(struct sock *sk, sockptr_t optval,
 				  unsigned int optlen, int tx)
 {
-	struct tls_crypto_info *crypto_info;
-	struct tls_crypto_info *alt_crypto_info;
+	struct tls_crypto_info *crypto_info, *alt_crypto_info;
+	struct tls_crypto_info *old_crypto_info = NULL;
 	struct tls_context *ctx = tls_get_ctx(sk);
+	union tls_crypto_context tmp = {};
+	bool update = false;
 	size_t optsize;
 	int rc = 0;
 	int conf;
@@ -687,9 +689,17 @@ static int do_tls_setsockopt_conf(struct sock *sk, sockptr_t optval,
 		alt_crypto_info = &ctx->crypto_send.info;
 	}
 
-	/* Currently we don't support set crypto info more than one time */
-	if (TLS_CRYPTO_INFO_READY(crypto_info))
-		return -EBUSY;
+	if (TLS_CRYPTO_INFO_READY(crypto_info)) {
+		/* Currently we only support setting crypto info more
+		 * than one time for TLS 1.3
+		 */
+		if (crypto_info->version != TLS_1_3_VERSION)
+			return -EBUSY;
+
+		update = true;
+		old_crypto_info = crypto_info;
+		crypto_info = &tmp.info;
+	}
 
 	rc = copy_from_sockptr(crypto_info, optval, sizeof(*crypto_info));
 	if (rc) {
@@ -704,8 +714,15 @@ static int do_tls_setsockopt_conf(struct sock *sk, sockptr_t optval,
 		goto err_crypto_info;
 	}
 
-	/* Ensure that TLS version and ciphers are same in both directions */
-	if (TLS_CRYPTO_INFO_READY(alt_crypto_info)) {
+	if (update) {
+		/* Ensure that TLS version and ciphers are not modified */
+		if (crypto_info->version != old_crypto_info->version ||
+		    crypto_info->cipher_type != old_crypto_info->cipher_type) {
+			rc = -EINVAL;
+			goto err_crypto_info;
+		}
+	} else if (TLS_CRYPTO_INFO_READY(alt_crypto_info)) {
+		/* Ensure that TLS version and ciphers are same in both directions */
 		if (alt_crypto_info->version != crypto_info->version ||
 		    alt_crypto_info->cipher_type != crypto_info->cipher_type) {
 			rc = -EINVAL;
@@ -772,7 +789,8 @@ static int do_tls_setsockopt_conf(struct sock *sk, sockptr_t optval,
 			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSTXDEVICE);
 			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSCURRTXDEVICE);
 		} else {
-			rc = tls_set_sw_offload(sk, 1);
+			rc = tls_set_sw_offload(sk, 1,
+						update ? crypto_info : NULL);
 			if (rc)
 				goto err_crypto_info;
 			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSTXSW);
@@ -786,7 +804,8 @@ static int do_tls_setsockopt_conf(struct sock *sk, sockptr_t optval,
 			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSRXDEVICE);
 			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSCURRRXDEVICE);
 		} else {
-			rc = tls_set_sw_offload(sk, 0);
+			rc = tls_set_sw_offload(sk, 0,
+						update ? crypto_info : NULL);
 			if (rc)
 				goto err_crypto_info;
 			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSRXSW);
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 149a39d9a56a..a35b3bfe5b47 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2508,23 +2508,50 @@ void tls_update_rx_zc_capable(struct tls_context *tls_ctx)
 		tls_ctx->prot_info.version != TLS_1_3_VERSION;
 }
 
-int tls_set_sw_offload(struct sock *sk, int tx)
+static void tls_finish_key_update(struct tls_context *tls_ctx)
+{
+	struct tls_sw_context_rx *ctx = tls_ctx->priv_ctx_rx;
+
+	ctx->key_update_pending = false;
+}
+
+int tls_set_sw_offload(struct sock *sk, int tx,
+		       struct tls_crypto_info *new_crypto_info)
 {
 	u16 nonce_size, tag_size, iv_size, rec_seq_size, salt_size;
+	struct tls_crypto_info *crypto_info, *src_crypto_info;
 	char *iv, *rec_seq, *key, *salt, *cipher_name;
 	struct tls_sw_context_tx *sw_ctx_tx = NULL;
 	struct tls_sw_context_rx *sw_ctx_rx = NULL;
 	struct tls_context *ctx = tls_get_ctx(sk);
-	struct tls_crypto_info *crypto_info;
+	size_t keysize, crypto_info_size;
 	struct cipher_context *cctx;
 	struct tls_prot_info *prot;
 	struct crypto_aead **aead;
 	struct crypto_tfm *tfm;
-	size_t keysize;
 	int rc = 0;
 
 	prot = &ctx->prot_info;
 
+	if (new_crypto_info) {
+		/* non-NULL new_crypto_info means rekey */
+		src_crypto_info = new_crypto_info;
+		if (tx) {
+			sw_ctx_tx = ctx->priv_ctx_tx;
+			crypto_info = &ctx->crypto_send.info;
+			cctx = &ctx->tx;
+			aead = &sw_ctx_tx->aead_send;
+			sw_ctx_tx = NULL;
+		} else {
+			sw_ctx_rx = ctx->priv_ctx_rx;
+			crypto_info = &ctx->crypto_recv.info;
+			cctx = &ctx->rx;
+			aead = &sw_ctx_rx->aead_recv;
+			sw_ctx_rx = NULL;
+		}
+		goto skip_init;
+	}
+
 	if (tx) {
 		if (!ctx->priv_ctx_tx) {
 			sw_ctx_tx = kzalloc(sizeof(*sw_ctx_tx), GFP_KERNEL);
@@ -2571,12 +2598,15 @@ int tls_set_sw_offload(struct sock *sk, int tx)
 		aead = &sw_ctx_rx->aead_recv;
 		sw_ctx_rx->key_update_pending = false;
 	}
+	src_crypto_info = crypto_info;
 
+skip_init:
 	switch (crypto_info->cipher_type) {
 	case TLS_CIPHER_AES_GCM_128: {
 		struct tls12_crypto_info_aes_gcm_128 *gcm_128_info;
 
-		gcm_128_info = (void *)crypto_info;
+		crypto_info_size = sizeof(struct tls12_crypto_info_aes_gcm_128);
+		gcm_128_info = (void *)src_crypto_info;
 		nonce_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
 		tag_size = TLS_CIPHER_AES_GCM_128_TAG_SIZE;
 		iv_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
@@ -2593,7 +2623,8 @@ int tls_set_sw_offload(struct sock *sk, int tx)
 	case TLS_CIPHER_AES_GCM_256: {
 		struct tls12_crypto_info_aes_gcm_256 *gcm_256_info;
 
-		gcm_256_info = (void *)crypto_info;
+		crypto_info_size = sizeof(struct tls12_crypto_info_aes_gcm_256);
+		gcm_256_info = (void *)src_crypto_info;
 		nonce_size = TLS_CIPHER_AES_GCM_256_IV_SIZE;
 		tag_size = TLS_CIPHER_AES_GCM_256_TAG_SIZE;
 		iv_size = TLS_CIPHER_AES_GCM_256_IV_SIZE;
@@ -2610,7 +2641,8 @@ int tls_set_sw_offload(struct sock *sk, int tx)
 	case TLS_CIPHER_AES_CCM_128: {
 		struct tls12_crypto_info_aes_ccm_128 *ccm_128_info;
 
-		ccm_128_info = (void *)crypto_info;
+		crypto_info_size = sizeof(struct tls12_crypto_info_aes_ccm_128);
+		ccm_128_info = (void *)src_crypto_info;
 		nonce_size = TLS_CIPHER_AES_CCM_128_IV_SIZE;
 		tag_size = TLS_CIPHER_AES_CCM_128_TAG_SIZE;
 		iv_size = TLS_CIPHER_AES_CCM_128_IV_SIZE;
@@ -2627,7 +2659,8 @@ int tls_set_sw_offload(struct sock *sk, int tx)
 	case TLS_CIPHER_CHACHA20_POLY1305: {
 		struct tls12_crypto_info_chacha20_poly1305 *chacha20_poly1305_info;
 
-		chacha20_poly1305_info = (void *)crypto_info;
+		crypto_info_size = sizeof(struct tls12_crypto_info_chacha20_poly1305);
+		chacha20_poly1305_info = (void *)src_crypto_info;
 		nonce_size = 0;
 		tag_size = TLS_CIPHER_CHACHA20_POLY1305_TAG_SIZE;
 		iv_size = TLS_CIPHER_CHACHA20_POLY1305_IV_SIZE;
@@ -2644,7 +2677,8 @@ int tls_set_sw_offload(struct sock *sk, int tx)
 	case TLS_CIPHER_SM4_GCM: {
 		struct tls12_crypto_info_sm4_gcm *sm4_gcm_info;
 
-		sm4_gcm_info = (void *)crypto_info;
+		crypto_info_size = sizeof(struct tls12_crypto_info_sm4_gcm);
+		sm4_gcm_info = (void *)src_crypto_info;
 		nonce_size = TLS_CIPHER_SM4_GCM_IV_SIZE;
 		tag_size = TLS_CIPHER_SM4_GCM_TAG_SIZE;
 		iv_size = TLS_CIPHER_SM4_GCM_IV_SIZE;
@@ -2661,7 +2695,8 @@ int tls_set_sw_offload(struct sock *sk, int tx)
 	case TLS_CIPHER_SM4_CCM: {
 		struct tls12_crypto_info_sm4_ccm *sm4_ccm_info;
 
-		sm4_ccm_info = (void *)crypto_info;
+		crypto_info_size = sizeof(struct tls12_crypto_info_sm4_ccm);
+		sm4_ccm_info = (void *)src_crypto_info;
 		nonce_size = TLS_CIPHER_SM4_CCM_IV_SIZE;
 		tag_size = TLS_CIPHER_SM4_CCM_TAG_SIZE;
 		iv_size = TLS_CIPHER_SM4_CCM_IV_SIZE;
@@ -2678,7 +2713,8 @@ int tls_set_sw_offload(struct sock *sk, int tx)
 	case TLS_CIPHER_ARIA_GCM_128: {
 		struct tls12_crypto_info_aria_gcm_128 *aria_gcm_128_info;
 
-		aria_gcm_128_info = (void *)crypto_info;
+		crypto_info_size = sizeof(struct tls12_crypto_info_aria_gcm_128);
+		aria_gcm_128_info = (void *)src_crypto_info;
 		nonce_size = TLS_CIPHER_ARIA_GCM_128_IV_SIZE;
 		tag_size = TLS_CIPHER_ARIA_GCM_128_TAG_SIZE;
 		iv_size = TLS_CIPHER_ARIA_GCM_128_IV_SIZE;
@@ -2695,7 +2731,8 @@ int tls_set_sw_offload(struct sock *sk, int tx)
 	case TLS_CIPHER_ARIA_GCM_256: {
 		struct tls12_crypto_info_aria_gcm_256 *gcm_256_info;
 
-		gcm_256_info = (void *)crypto_info;
+		crypto_info_size = sizeof(struct tls12_crypto_info_aria_gcm_256);
+		gcm_256_info = (void *)src_crypto_info;
 		nonce_size = TLS_CIPHER_ARIA_GCM_256_IV_SIZE;
 		tag_size = TLS_CIPHER_ARIA_GCM_256_TAG_SIZE;
 		iv_size = TLS_CIPHER_ARIA_GCM_256_IV_SIZE;
@@ -2739,19 +2776,18 @@ int tls_set_sw_offload(struct sock *sk, int tx)
 			      prot->tag_size + prot->tail_size;
 	prot->iv_size = iv_size;
 	prot->salt_size = salt_size;
-	cctx->iv = kmalloc(iv_size + salt_size, GFP_KERNEL);
-	if (!cctx->iv) {
-		rc = -ENOMEM;
-		goto free_priv;
-	}
-	/* Note: 128 & 256 bit salt are the same size */
-	prot->rec_seq_size = rec_seq_size;
-	memcpy(cctx->iv, salt, salt_size);
-	memcpy(cctx->iv + salt_size, iv, iv_size);
-	cctx->rec_seq = kmemdup(rec_seq, rec_seq_size, GFP_KERNEL);
-	if (!cctx->rec_seq) {
-		rc = -ENOMEM;
-		goto free_iv;
+	if (!new_crypto_info) {
+		cctx->iv = kmalloc(iv_size + salt_size, GFP_KERNEL);
+		if (!cctx->iv) {
+			rc = -ENOMEM;
+			goto free_priv;
+		}
+
+		cctx->rec_seq = kmemdup(rec_seq, rec_seq_size, GFP_KERNEL);
+		if (!cctx->rec_seq) {
+			rc = -ENOMEM;
+			goto free_iv;
+		}
 	}
 
 	if (!*aead) {
@@ -2765,14 +2801,24 @@ int tls_set_sw_offload(struct sock *sk, int tx)
 
 	ctx->push_pending_record = tls_sw_push_pending_record;
 
+	/* setkey is the last operation that could fail during a
+	 * rekey. if it succeeds, we can start modifying the
+	 * context.
+	 */
 	rc = crypto_aead_setkey(*aead, key, keysize);
+	if (rc) {
+		if (new_crypto_info)
+			goto out;
+		else
+			goto free_aead;
+	}
 
-	if (rc)
-		goto free_aead;
-
-	rc = crypto_aead_setauthsize(*aead, prot->tag_size);
-	if (rc)
-		goto free_aead;
+	if (!new_crypto_info) {
+		rc = crypto_aead_setauthsize(*aead, prot->tag_size);
+		if (rc) {
+			goto free_aead;
+		}
+	}
 
 	if (sw_ctx_rx) {
 		tfm = crypto_aead_tfm(sw_ctx_rx->aead_recv);
@@ -2787,6 +2833,20 @@ int tls_set_sw_offload(struct sock *sk, int tx)
 			goto free_aead;
 	}
 
+	/* Note: 128 & 256 bit salt are the same size */
+	prot->rec_seq_size = rec_seq_size;
+	memcpy(cctx->iv, salt, salt_size);
+	memcpy(cctx->iv + salt_size, iv, iv_size);
+
+	if (new_crypto_info) {
+		memcpy(cctx->rec_seq, rec_seq, rec_seq_size);
+
+		memcpy(crypto_info, new_crypto_info, crypto_info_size);
+		memzero_explicit(new_crypto_info, crypto_info_size);
+		if (!tx)
+			tls_finish_key_update(ctx);
+	}
+
 	goto out;
 
 free_aead:
@@ -2799,12 +2859,14 @@ int tls_set_sw_offload(struct sock *sk, int tx)
 	kfree(cctx->iv);
 	cctx->iv = NULL;
 free_priv:
-	if (tx) {
-		kfree(ctx->priv_ctx_tx);
-		ctx->priv_ctx_tx = NULL;
-	} else {
-		kfree(ctx->priv_ctx_rx);
-		ctx->priv_ctx_rx = NULL;
+	if (!new_crypto_info) {
+		if (tx) {
+			kfree(ctx->priv_ctx_tx);
+			ctx->priv_ctx_tx = NULL;
+		} else {
+			kfree(ctx->priv_ctx_rx);
+			ctx->priv_ctx_rx = NULL;
+		}
 	}
 out:
 	return rc;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH net-next v2 4/5] selftests: tls: add key_generation argument to tls_crypto_info_init
  2023-02-14 11:17 [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3 Sabrina Dubroca
                   ` (2 preceding siblings ...)
  2023-02-14 11:17 ` [PATCH net-next v2 3/5] tls: implement rekey for TLS1.3 Sabrina Dubroca
@ 2023-02-14 11:17 ` Sabrina Dubroca
  2023-02-14 11:17 ` [PATCH net-next v2 5/5] selftests: tls: add rekey tests Sabrina Dubroca
  2023-02-15  5:08 ` [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3 Jakub Kicinski
  5 siblings, 0 replies; 21+ messages in thread
From: Sabrina Dubroca @ 2023-02-14 11:17 UTC (permalink / raw)
  To: netdev
  Cc: Sabrina Dubroca, Vadim Fedorenko, Frantisek Krenzelok,
	Jakub Kicinski, Kuniyuki Iwashima, Apoorv Kothari,
	Boris Pismenny, John Fastabend, Shuah Khan, linux-kselftest,
	Gal Pressman, Marcel Holtmann

This allows us to generate different keys, so that we can test that
rekey is using the correct one.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 tools/testing/selftests/net/tls.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index 2cbb12736596..5f3adb28fee1 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -38,9 +38,11 @@ struct tls_crypto_info_keys {
 };
 
 static void tls_crypto_info_init(uint16_t tls_version, uint16_t cipher_type,
-				 struct tls_crypto_info_keys *tls12)
+				 struct tls_crypto_info_keys *tls12,
+				 char key_generation)
 {
-	memset(tls12, 0, sizeof(*tls12));
+	memset(tls12, key_generation, sizeof(*tls12));
+	memset(tls12, 0, sizeof(struct tls_crypto_info));
 
 	switch (cipher_type) {
 	case TLS_CIPHER_CHACHA20_POLY1305:
@@ -312,7 +314,7 @@ FIXTURE_SETUP(tls)
 	int ret;
 
 	tls_crypto_info_init(variant->tls_version, variant->cipher_type,
-			     &tls12);
+			     &tls12, 0);
 
 	ulp_sock_pair(_metadata, &self->fd, &self->cfd, &self->notls);
 
@@ -1071,7 +1073,7 @@ TEST_F(tls, bidir)
 		struct tls_crypto_info_keys tls12;
 
 		tls_crypto_info_init(variant->tls_version, variant->cipher_type,
-				     &tls12);
+				     &tls12, 0);
 
 		ret = setsockopt(self->fd, SOL_TLS, TLS_RX, &tls12,
 				 tls12.len);
@@ -1479,7 +1481,7 @@ FIXTURE_SETUP(tls_err)
 	int ret;
 
 	tls_crypto_info_init(variant->tls_version, TLS_CIPHER_AES_GCM_128,
-			     &tls12);
+			     &tls12, 0);
 
 	ulp_sock_pair(_metadata, &self->fd, &self->cfd, &self->notls);
 	ulp_sock_pair(_metadata, &self->fd2, &self->cfd2, &self->notls);
@@ -1771,7 +1773,7 @@ TEST(tls_v6ops) {
 	int sfd, ret, fd;
 	socklen_t len, len2;
 
-	tls_crypto_info_init(TLS_1_2_VERSION, TLS_CIPHER_AES_GCM_128, &tls12);
+	tls_crypto_info_init(TLS_1_2_VERSION, TLS_CIPHER_AES_GCM_128, &tls12, 0);
 
 	addr.sin6_family = AF_INET6;
 	addr.sin6_addr = in6addr_any;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH net-next v2 5/5] selftests: tls: add rekey tests
  2023-02-14 11:17 [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3 Sabrina Dubroca
                   ` (3 preceding siblings ...)
  2023-02-14 11:17 ` [PATCH net-next v2 4/5] selftests: tls: add key_generation argument to tls_crypto_info_init Sabrina Dubroca
@ 2023-02-14 11:17 ` Sabrina Dubroca
  2023-02-15  5:08 ` [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3 Jakub Kicinski
  5 siblings, 0 replies; 21+ messages in thread
From: Sabrina Dubroca @ 2023-02-14 11:17 UTC (permalink / raw)
  To: netdev
  Cc: Sabrina Dubroca, Vadim Fedorenko, Frantisek Krenzelok,
	Jakub Kicinski, Kuniyuki Iwashima, Apoorv Kothari,
	Boris Pismenny, John Fastabend, Shuah Khan, linux-kselftest,
	Gal Pressman, Marcel Holtmann

v2: add rekey_fail test (reject changing the version/cipher)

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 tools/testing/selftests/net/tls.c | 322 ++++++++++++++++++++++++++++++
 1 file changed, 322 insertions(+)

diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index 5f3adb28fee1..098d52859521 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -1453,6 +1453,328 @@ TEST_F(tls, shutdown_reuse)
 	EXPECT_EQ(errno, EISCONN);
 }
 
+#define TLS_RECORD_TYPE_HANDSHAKE      0x16
+/* key_update, length 1, update_not_requested */
+static const char key_update_msg[] = "\x18\x00\x00\x01\x00";
+static void tls_send_keyupdate(struct __test_metadata *_metadata, int fd)
+{
+	size_t len = sizeof(key_update_msg);
+
+	EXPECT_EQ(tls_send_cmsg(fd, TLS_RECORD_TYPE_HANDSHAKE,
+				(char *)key_update_msg, len, 0),
+		  len);
+}
+
+static void tls_recv_keyupdate(struct __test_metadata *_metadata, int fd, int flags)
+{
+	char buf[100];
+
+	EXPECT_EQ(tls_recv_cmsg(_metadata, fd, TLS_RECORD_TYPE_HANDSHAKE, buf, sizeof(buf), flags),
+		  sizeof(key_update_msg));
+	EXPECT_EQ(memcmp(buf, key_update_msg, sizeof(key_update_msg)), 0);
+}
+
+/* set the key to 0 then 1 for RX, immediately to 1 for TX */
+TEST_F(tls_basic, rekey_rx)
+{
+	struct tls_crypto_info_keys tls12_0, tls12_1;
+	char const *test_str = "test_message";
+	int send_len = strlen(test_str) + 1;
+	char buf[20];
+	int ret;
+
+	if (self->notls)
+		return;
+
+	tls_crypto_info_init(TLS_1_3_VERSION, TLS_CIPHER_AES_GCM_128,
+			     &tls12_0, 0);
+	tls_crypto_info_init(TLS_1_3_VERSION, TLS_CIPHER_AES_GCM_128,
+			     &tls12_1, 1);
+
+
+	ret = setsockopt(self->fd, SOL_TLS, TLS_TX, &tls12_1, tls12_1.len);
+	ASSERT_EQ(ret, 0);
+
+	ret = setsockopt(self->cfd, SOL_TLS, TLS_RX, &tls12_0, tls12_0.len);
+	ASSERT_EQ(ret, 0);
+
+	ret = setsockopt(self->cfd, SOL_TLS, TLS_RX, &tls12_1, tls12_1.len);
+	EXPECT_EQ(ret, 0);
+
+	EXPECT_EQ(send(self->fd, test_str, send_len, 0), send_len);
+	EXPECT_EQ(recv(self->cfd, buf, send_len, 0), send_len);
+	EXPECT_EQ(memcmp(buf, test_str, send_len), 0);
+}
+
+/* set the key to 0 then 1 for TX, immediately to 1 for RX */
+TEST_F(tls_basic, rekey_tx)
+{
+	struct tls_crypto_info_keys tls12_0, tls12_1;
+	char const *test_str = "test_message";
+	int send_len = strlen(test_str) + 1;
+	char buf[20];
+	int ret;
+
+	if (self->notls)
+		return;
+
+	tls_crypto_info_init(TLS_1_3_VERSION, TLS_CIPHER_AES_GCM_128,
+			     &tls12_0, 0);
+	tls_crypto_info_init(TLS_1_3_VERSION, TLS_CIPHER_AES_GCM_128,
+			     &tls12_1, 1);
+
+
+	ret = setsockopt(self->fd, SOL_TLS, TLS_TX, &tls12_0, tls12_0.len);
+	ASSERT_EQ(ret, 0);
+
+	ret = setsockopt(self->cfd, SOL_TLS, TLS_RX, &tls12_1, tls12_1.len);
+	ASSERT_EQ(ret, 0);
+
+	ret = setsockopt(self->fd, SOL_TLS, TLS_TX, &tls12_1, tls12_1.len);
+	EXPECT_EQ(ret, 0);
+
+	EXPECT_EQ(send(self->fd, test_str, send_len, 0), send_len);
+	EXPECT_EQ(recv(self->cfd, buf, send_len, 0), send_len);
+	EXPECT_EQ(memcmp(buf, test_str, send_len), 0);
+}
+
+TEST_F(tls, rekey)
+{
+	char const *test_str_1 = "test_message_before_rekey";
+	char const *test_str_2 = "test_message_after_rekey";
+	struct tls_crypto_info_keys tls12;
+	int send_len;
+	char buf[100];
+
+	if (variant->tls_version != TLS_1_3_VERSION)
+		return;
+
+	/* initial send/recv */
+	send_len = strlen(test_str_1) + 1;
+	EXPECT_EQ(send(self->fd, test_str_1, send_len, 0), send_len);
+	EXPECT_EQ(recv(self->cfd, buf, send_len, 0), send_len);
+	EXPECT_EQ(memcmp(buf, test_str_1, send_len), 0);
+
+	/* update TX key */
+	tls_send_keyupdate(_metadata, self->fd);
+	tls_crypto_info_init(variant->tls_version, variant->cipher_type, &tls12, 1);
+	EXPECT_EQ(setsockopt(self->fd, SOL_TLS, TLS_TX, &tls12, tls12.len), 0);
+
+	/* send after rekey */
+	send_len = strlen(test_str_2) + 1;
+	EXPECT_EQ(send(self->fd, test_str_2, send_len, 0), send_len);
+
+	/* can't receive the KeyUpdate without a control message */
+	EXPECT_EQ(recv(self->cfd, buf, send_len, 0), -1);
+
+	/* get KeyUpdate */
+	tls_recv_keyupdate(_metadata, self->cfd, 0);
+
+	/* recv blocking -> -EKEYEXPIRED */
+	EXPECT_EQ(recv(self->cfd, buf, sizeof(buf), 0), -1);
+	EXPECT_EQ(errno, EKEYEXPIRED);
+
+	/* recv non-blocking -> -EKEYEXPIRED */
+	EXPECT_EQ(recv(self->cfd, buf, sizeof(buf), MSG_DONTWAIT), -1);
+	EXPECT_EQ(errno, EKEYEXPIRED);
+
+	/* update RX key */
+	EXPECT_EQ(setsockopt(self->cfd, SOL_TLS, TLS_RX, &tls12, tls12.len), 0);
+
+	/* recv after rekey */
+	EXPECT_NE(recv(self->cfd, buf, send_len, 0), -1);
+	EXPECT_EQ(memcmp(buf, test_str_2, send_len), 0);
+}
+
+TEST_F(tls, rekey_fail)
+{
+	char const *test_str_1 = "test_message_before_rekey";
+	char const *test_str_2 = "test_message_after_rekey";
+	struct tls_crypto_info_keys tls12;
+	int send_len;
+	char buf[100];
+
+	if (variant->tls_version != TLS_1_3_VERSION)
+		return;
+
+	/* initial send/recv */
+	send_len = strlen(test_str_1) + 1;
+	EXPECT_EQ(send(self->fd, test_str_1, send_len, 0), send_len);
+	EXPECT_EQ(recv(self->cfd, buf, send_len, 0), send_len);
+	EXPECT_EQ(memcmp(buf, test_str_1, send_len), 0);
+
+	/* update TX key */
+	tls_send_keyupdate(_metadata, self->fd);
+
+	/* successful update */
+	tls_crypto_info_init(variant->tls_version, variant->cipher_type, &tls12, 1);
+	EXPECT_EQ(setsockopt(self->fd, SOL_TLS, TLS_TX, &tls12, tls12.len), 0);
+
+	/* invalid update: change of version */
+	tls_crypto_info_init(TLS_1_2_VERSION, variant->cipher_type, &tls12, 1);
+	EXPECT_EQ(setsockopt(self->fd, SOL_TLS, TLS_TX, &tls12, tls12.len), -1);
+	EXPECT_EQ(errno, EINVAL);
+
+	/* invalid update: change of cipher */
+	if (variant->cipher_type == TLS_CIPHER_AES_GCM_256)
+		tls_crypto_info_init(variant->tls_version, TLS_CIPHER_CHACHA20_POLY1305, &tls12, 1);
+	else
+		tls_crypto_info_init(variant->tls_version, TLS_CIPHER_AES_GCM_256, &tls12, 1);
+	EXPECT_EQ(setsockopt(self->fd, SOL_TLS, TLS_TX, &tls12, tls12.len), -1);
+	EXPECT_EQ(errno, EINVAL);
+
+	/* send after rekey, the invalid updates shouldn't have an effect */
+	send_len = strlen(test_str_2) + 1;
+	EXPECT_EQ(send(self->fd, test_str_2, send_len, 0), send_len);
+
+	/* can't receive the KeyUpdate without a control message */
+	EXPECT_EQ(recv(self->cfd, buf, send_len, 0), -1);
+
+	/* get KeyUpdate */
+	tls_recv_keyupdate(_metadata, self->cfd, 0);
+
+	/* recv blocking -> -EKEYEXPIRED */
+	EXPECT_EQ(recv(self->cfd, buf, sizeof(buf), 0), -1);
+	EXPECT_EQ(errno, EKEYEXPIRED);
+
+	/* recv non-blocking -> -EKEYEXPIRED */
+	EXPECT_EQ(recv(self->cfd, buf, sizeof(buf), MSG_DONTWAIT), -1);
+	EXPECT_EQ(errno, EKEYEXPIRED);
+
+	/* update RX key */
+	tls_crypto_info_init(variant->tls_version, variant->cipher_type, &tls12, 1);
+	EXPECT_EQ(setsockopt(self->cfd, SOL_TLS, TLS_RX, &tls12, tls12.len), 0);
+
+	/* recv after rekey */
+	EXPECT_NE(recv(self->cfd, buf, send_len, 0), -1);
+	EXPECT_EQ(memcmp(buf, test_str_2, send_len), 0);
+}
+
+TEST_F(tls, rekey_peek)
+{
+	char const *test_str_1 = "test_message_before_rekey";
+	struct tls_crypto_info_keys tls12;
+	int send_len;
+	char buf[100];
+
+	if (variant->tls_version != TLS_1_3_VERSION)
+		return;
+
+	send_len = strlen(test_str_1) + 1;
+	EXPECT_EQ(send(self->fd, test_str_1, send_len, 0), send_len);
+
+	/* update TX key */
+	tls_send_keyupdate(_metadata, self->fd);
+	tls_crypto_info_init(variant->tls_version, variant->cipher_type, &tls12, 1);
+	EXPECT_EQ(setsockopt(self->fd, SOL_TLS, TLS_TX, &tls12, tls12.len), 0);
+
+	EXPECT_EQ(recv(self->cfd, buf, sizeof(buf), MSG_PEEK), send_len);
+	EXPECT_EQ(memcmp(buf, test_str_1, send_len), 0);
+
+	EXPECT_EQ(recv(self->cfd, buf, send_len, 0), send_len);
+	EXPECT_EQ(memcmp(buf, test_str_1, send_len), 0);
+
+	/* can't receive the KeyUpdate without a control message */
+	EXPECT_EQ(recv(self->cfd, buf, send_len, MSG_PEEK), -1);
+
+	/* peek KeyUpdate */
+	tls_recv_keyupdate(_metadata, self->cfd, MSG_PEEK);
+
+	/* get KeyUpdate */
+	tls_recv_keyupdate(_metadata, self->cfd, 0);
+
+	/* update RX key */
+	EXPECT_EQ(setsockopt(self->cfd, SOL_TLS, TLS_RX, &tls12, tls12.len), 0);
+}
+
+TEST_F(tls, splice_rekey)
+{
+	int send_len = TLS_PAYLOAD_MAX_LEN / 2;
+	char mem_send[TLS_PAYLOAD_MAX_LEN];
+	char mem_recv[TLS_PAYLOAD_MAX_LEN];
+	struct tls_crypto_info_keys tls12;
+	int p[2];
+
+	if (variant->tls_version != TLS_1_3_VERSION)
+		return;
+
+	memrnd(mem_send, sizeof(mem_send));
+
+	ASSERT_GE(pipe(p), 0);
+	EXPECT_EQ(send(self->fd, mem_send, send_len, 0), send_len);
+
+	/* update TX key */
+	tls_send_keyupdate(_metadata, self->fd);
+	tls_crypto_info_init(variant->tls_version, variant->cipher_type, &tls12, 1);
+	EXPECT_EQ(setsockopt(self->fd, SOL_TLS, TLS_TX, &tls12, tls12.len), 0);
+
+	EXPECT_EQ(send(self->fd, mem_send, send_len, 0), send_len);
+
+	EXPECT_EQ(splice(self->cfd, NULL, p[1], NULL, TLS_PAYLOAD_MAX_LEN, 0), send_len);
+	EXPECT_EQ(read(p[0], mem_recv, send_len), send_len);
+	EXPECT_EQ(memcmp(mem_send, mem_recv, send_len), 0);
+
+	/* can't splice the KeyUpdate */
+	EXPECT_EQ(splice(self->cfd, NULL, p[1], NULL, TLS_PAYLOAD_MAX_LEN, 0), -1);
+	EXPECT_EQ(errno, EINVAL);
+
+	/* peek KeyUpdate */
+	tls_recv_keyupdate(_metadata, self->cfd, MSG_PEEK);
+
+	/* get KeyUpdate */
+	tls_recv_keyupdate(_metadata, self->cfd, 0);
+
+	/* can't splice before updating the key */
+	EXPECT_EQ(splice(self->cfd, NULL, p[1], NULL, TLS_PAYLOAD_MAX_LEN, 0), -1);
+	EXPECT_EQ(errno, EKEYEXPIRED);
+
+	/* update RX key */
+	EXPECT_EQ(setsockopt(self->cfd, SOL_TLS, TLS_RX, &tls12, tls12.len), 0);
+
+	EXPECT_EQ(splice(self->cfd, NULL, p[1], NULL, TLS_PAYLOAD_MAX_LEN, 0), send_len);
+	EXPECT_EQ(read(p[0], mem_recv, send_len), send_len);
+	EXPECT_EQ(memcmp(mem_send, mem_recv, send_len), 0);
+}
+
+TEST_F(tls, rekey_getsockopt)
+{
+	struct tls_crypto_info_keys tls12;
+	struct tls_crypto_info_keys tls12_get;
+	socklen_t len;
+
+	tls_crypto_info_init(variant->tls_version, variant->cipher_type, &tls12, 0);
+
+	len = tls12.len;
+	EXPECT_EQ(getsockopt(self->fd, SOL_TLS, TLS_TX, &tls12_get, &len), 0);
+	EXPECT_EQ(len, tls12.len);
+	EXPECT_EQ(memcmp(&tls12_get, &tls12, tls12.len), 0);
+
+	len = tls12.len;
+	EXPECT_EQ(getsockopt(self->cfd, SOL_TLS, TLS_RX, &tls12_get, &len), 0);
+	EXPECT_EQ(len, tls12.len);
+	EXPECT_EQ(memcmp(&tls12_get, &tls12, tls12.len), 0);
+
+	if (variant->tls_version != TLS_1_3_VERSION)
+		return;
+
+	tls_send_keyupdate(_metadata, self->fd);
+	tls_crypto_info_init(variant->tls_version, variant->cipher_type, &tls12, 1);
+	EXPECT_EQ(setsockopt(self->fd, SOL_TLS, TLS_TX, &tls12, tls12.len), 0);
+
+	tls_recv_keyupdate(_metadata, self->cfd, 0);
+	EXPECT_EQ(setsockopt(self->cfd, SOL_TLS, TLS_RX, &tls12, tls12.len), 0);
+
+	len = tls12.len;
+	EXPECT_EQ(getsockopt(self->fd, SOL_TLS, TLS_TX, &tls12_get, &len), 0);
+	EXPECT_EQ(len, tls12.len);
+	EXPECT_EQ(memcmp(&tls12_get, &tls12, tls12.len), 0);
+
+	len = tls12.len;
+	EXPECT_EQ(getsockopt(self->cfd, SOL_TLS, TLS_RX, &tls12_get, &len), 0);
+	EXPECT_EQ(len, tls12.len);
+	EXPECT_EQ(memcmp(&tls12_get, &tls12, tls12.len), 0);
+}
+
 FIXTURE(tls_err)
 {
 	int fd, cfd;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3
  2023-02-14 11:17 [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3 Sabrina Dubroca
                   ` (4 preceding siblings ...)
  2023-02-14 11:17 ` [PATCH net-next v2 5/5] selftests: tls: add rekey tests Sabrina Dubroca
@ 2023-02-15  5:08 ` Jakub Kicinski
  2023-02-15 17:29   ` Sabrina Dubroca
  5 siblings, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2023-02-15  5:08 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: netdev, Vadim Fedorenko, Frantisek Krenzelok, Kuniyuki Iwashima,
	Apoorv Kothari, Boris Pismenny, John Fastabend, Shuah Khan,
	linux-kselftest, Gal Pressman, Marcel Holtmann

On Tue, 14 Feb 2023 12:17:37 +0100 Sabrina Dubroca wrote:
> Changes in v2
> use reverse xmas tree ordering in tls_set_sw_offload and
> do_tls_setsockopt_conf
> turn the alt_crypto_info into an else if
> selftests: add rekey_fail test
> 
> Vadim suggested simplifying tls_set_sw_offload by copying the new
> crypto_info in the context in do_tls_setsockopt_conf, and then
> detecting the rekey in tls_set_sw_offload based on whether the iv was
> already set, but I don't think we can have a common error path
> (otherwise we'd free the aead etc on rekey failure). I decided instead
> to reorganize tls_set_sw_offload so that the context is unmodified
> until we know the rekey cannot fail. Some fields will be touched
> during the rekey, but they will be set to the same value they had
> before the rekey (prot->rec_seq_size, etc).
> 
> Apoorv suggested to name the struct tls_crypto_info_keys "tls13"
> rather than "tls12". Since we're using the same crypto_info data for
> TLS1.3 as for 1.2, even if the tests only run for TLS1.3, I'd rather
> keep the "tls12" name, in case we end up adding a
> "tls13_crypto_info_aes_gcm_128" type in the future.
> 
> Kuniyuki and Apoorv also suggested preventing rekeys on RX when we
> haven't received a matching KeyUpdate message, but I'd rather let
> userspace handle this and have a symmetric API between TX and RX on
> the kernel side. It's a bit of a foot-gun, but we can't really stop a
> broken userspace from rolling back the rec_seq on an existing
> crypto_info either, and that seems like a worse possible breakage.

And how will we handle re-keying in offload?

>  include/net/tls.h                 |   4 +
>  net/tls/tls.h                     |   3 +-
>  net/tls/tls_device.c              |   2 +-
>  net/tls/tls_main.c                |  37 +++-
>  net/tls/tls_sw.c                  | 189 +++++++++++++----
>  tools/testing/selftests/net/tls.c | 336 +++++++++++++++++++++++++++++-

Documentation please.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next v2 2/5] tls: block decryption when a rekey is pending
  2023-02-14 11:17 ` [PATCH net-next v2 2/5] tls: block decryption when a rekey is pending Sabrina Dubroca
@ 2023-02-15  5:09   ` Jakub Kicinski
  2023-02-15 17:37     ` Sabrina Dubroca
  0 siblings, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2023-02-15  5:09 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: netdev, Vadim Fedorenko, Frantisek Krenzelok, Kuniyuki Iwashima,
	Apoorv Kothari, Boris Pismenny, John Fastabend, Shuah Khan,
	linux-kselftest, Gal Pressman, Marcel Holtmann

On Tue, 14 Feb 2023 12:17:39 +0100 Sabrina Dubroca wrote:
> @@ -2141,6 +2178,12 @@ ssize_t tls_sw_splice_read(struct socket *sock,  loff_t *ppos,
>  	if (err < 0)
>  		return err;
>  
> +	/* a rekey is pending, let userspace deal with it */
> +	if (unlikely(ctx->key_update_pending)) {
> +		err = -EKEYEXPIRED;
> +		goto splice_read_end;
> +	}

This will prevent splicing peek()'ed data.
Just put the check in tls_rx_rec_wait().

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3
  2023-02-15  5:08 ` [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3 Jakub Kicinski
@ 2023-02-15 17:29   ` Sabrina Dubroca
  2023-02-15 19:10     ` Jakub Kicinski
  0 siblings, 1 reply; 21+ messages in thread
From: Sabrina Dubroca @ 2023-02-15 17:29 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Vadim Fedorenko, Frantisek Krenzelok, Kuniyuki Iwashima,
	Apoorv Kothari, Boris Pismenny, John Fastabend, Shuah Khan,
	linux-kselftest, Gal Pressman, Marcel Holtmann

2023-02-14, 21:08:11 -0800, Jakub Kicinski wrote:
> On Tue, 14 Feb 2023 12:17:37 +0100 Sabrina Dubroca wrote:
> > Changes in v2
> > use reverse xmas tree ordering in tls_set_sw_offload and
> > do_tls_setsockopt_conf
> > turn the alt_crypto_info into an else if
> > selftests: add rekey_fail test
> > 
> > Vadim suggested simplifying tls_set_sw_offload by copying the new
> > crypto_info in the context in do_tls_setsockopt_conf, and then
> > detecting the rekey in tls_set_sw_offload based on whether the iv was
> > already set, but I don't think we can have a common error path
> > (otherwise we'd free the aead etc on rekey failure). I decided instead
> > to reorganize tls_set_sw_offload so that the context is unmodified
> > until we know the rekey cannot fail. Some fields will be touched
> > during the rekey, but they will be set to the same value they had
> > before the rekey (prot->rec_seq_size, etc).
> > 
> > Apoorv suggested to name the struct tls_crypto_info_keys "tls13"
> > rather than "tls12". Since we're using the same crypto_info data for
> > TLS1.3 as for 1.2, even if the tests only run for TLS1.3, I'd rather
> > keep the "tls12" name, in case we end up adding a
> > "tls13_crypto_info_aes_gcm_128" type in the future.
> > 
> > Kuniyuki and Apoorv also suggested preventing rekeys on RX when we
> > haven't received a matching KeyUpdate message, but I'd rather let
> > userspace handle this and have a symmetric API between TX and RX on
> > the kernel side. It's a bit of a foot-gun, but we can't really stop a
> > broken userspace from rolling back the rec_seq on an existing
> > crypto_info either, and that seems like a worse possible breakage.
> 
> And how will we handle re-keying in offload?

Sorry for the stupid question... do you mean that I need to solve that
problem before this series can progress, or that the cover letter
should summarize the state of the discussion?

> >  include/net/tls.h                 |   4 +
> >  net/tls/tls.h                     |   3 +-
> >  net/tls/tls_device.c              |   2 +-
> >  net/tls/tls_main.c                |  37 +++-
> >  net/tls/tls_sw.c                  | 189 +++++++++++++----
> >  tools/testing/selftests/net/tls.c | 336 +++++++++++++++++++++++++++++-
> 
> Documentation please.

Ok, I'll add a paragraph to Documentation/networking/tls.rst in the
next version.

-- 
Sabrina


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next v2 2/5] tls: block decryption when a rekey is pending
  2023-02-15  5:09   ` Jakub Kicinski
@ 2023-02-15 17:37     ` Sabrina Dubroca
  0 siblings, 0 replies; 21+ messages in thread
From: Sabrina Dubroca @ 2023-02-15 17:37 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Vadim Fedorenko, Frantisek Krenzelok, Kuniyuki Iwashima,
	Apoorv Kothari, Boris Pismenny, John Fastabend, Shuah Khan,
	linux-kselftest, Gal Pressman, Marcel Holtmann

2023-02-14, 21:09:25 -0800, Jakub Kicinski wrote:
> On Tue, 14 Feb 2023 12:17:39 +0100 Sabrina Dubroca wrote:
> > @@ -2141,6 +2178,12 @@ ssize_t tls_sw_splice_read(struct socket *sock,  loff_t *ppos,
> >  	if (err < 0)
> >  		return err;
> >  
> > +	/* a rekey is pending, let userspace deal with it */
> > +	if (unlikely(ctx->key_update_pending)) {
> > +		err = -EKEYEXPIRED;
> > +		goto splice_read_end;
> > +	}
> 
> This will prevent splicing peek()'ed data.
> Just put the check in tls_rx_rec_wait().

Ok, I'll do that and add a selftest for this sequence of syscalls.

-- 
Sabrina


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3
  2023-02-15 17:29   ` Sabrina Dubroca
@ 2023-02-15 19:10     ` Jakub Kicinski
  2023-02-15 23:23       ` Sabrina Dubroca
  0 siblings, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2023-02-15 19:10 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: netdev, Vadim Fedorenko, Frantisek Krenzelok, Kuniyuki Iwashima,
	Apoorv Kothari, Boris Pismenny, John Fastabend, Shuah Khan,
	linux-kselftest, Gal Pressman, Marcel Holtmann

On Wed, 15 Feb 2023 18:29:50 +0100 Sabrina Dubroca wrote:
> > And how will we handle re-keying in offload?  
> 
> Sorry for the stupid question... do you mean that I need to solve that
> problem before this series can progress, or that the cover letter
> should summarize the state of the discussion?

I maintain that 1.3 offload is much more important than rekeying.
Offloads being available for 1.2 may be stalling adoption of 1.3
(just a guess, I run across this article mentioning 1.2 being used
in Oracle cloud for instance:
https://blogs.oracle.com/cloudsecurity/post/how-oci-helps-you-protect-data-with-default-encryption
could be because MITM requirements, or maybe they have HW which
can only do 1.2? Dunno).

But I'm willing to compromise, we just need a solid plan of how to
handle the inevitable. I'm worried that how this will pay out is:
 - you don't care about offload and add rekey
 - vendors don't care about rekey and add 1.3
  ... time passes ...
 - both you and the vendors have moved on
 - users run into issues, waste their time debugging and
   eventually report the problem upstream
 - it's on me to fix?

:(

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3
  2023-02-15 19:10     ` Jakub Kicinski
@ 2023-02-15 23:23       ` Sabrina Dubroca
  2023-02-16  3:57         ` Jakub Kicinski
  0 siblings, 1 reply; 21+ messages in thread
From: Sabrina Dubroca @ 2023-02-15 23:23 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Vadim Fedorenko, Frantisek Krenzelok, Kuniyuki Iwashima,
	Apoorv Kothari, Boris Pismenny, John Fastabend, Shuah Khan,
	linux-kselftest, Gal Pressman, Marcel Holtmann

2023-02-15, 11:10:20 -0800, Jakub Kicinski wrote:
> On Wed, 15 Feb 2023 18:29:50 +0100 Sabrina Dubroca wrote:
> > > And how will we handle re-keying in offload?  
> > 
> > Sorry for the stupid question... do you mean that I need to solve that
> > problem before this series can progress, or that the cover letter
> > should summarize the state of the discussion?
> 
> I maintain that 1.3 offload is much more important than rekeying.

Sure, it'd be great if we had 1.3 offload (and it should also support
rekeying, because you can't force the peer to not rekey). But I can't
implement offload.

> Offloads being available for 1.2 may be stalling adoption of 1.3
> (just a guess, I run across this article mentioning 1.2 being used
> in Oracle cloud for instance:
> https://blogs.oracle.com/cloudsecurity/post/how-oci-helps-you-protect-data-with-default-encryption
> could be because MITM requirements, or maybe they have HW which
> can only do 1.2? Dunno).
> 
> But I'm willing to compromise, we just need a solid plan of how to
> handle the inevitable. I'm worried that how this will pay out is:
>  - you don't care about offload and add rekey

I think that's a bit unfair. Not having to deal with offload at all
would make things easier for me, sure, but I'm open to the discussion,
even if I don't have a good understanding of the offloading side.

I'd just like to avoid holding this feature (arguably a bug fix) until
the vendors finally decide that they care about 1.3, if possible. If
not, so be it.

I wasn't trying to force you to accept this series. Sorry if that's
what it sounded like. I really wanted to understand what you were
asking for, because your question wasn't clear to me. Now it makes
sense.

>  - vendors don't care about rekey and add 1.3
>   ... time passes ...
>  - both you and the vendors have moved on
>  - users run into issues, waste their time debugging and
>    eventually report the problem upstream
>  - it's on me to fix?
> 
> :(

Yeah, I see. If the rekey already exists in SW, I think it'll be a bit
harder for them to just not care about it, but maybe I'm being
optimistic.

I'm not sure we can come up with the correct uAPI/rekey design without
trying to implement rekey with offload and seeing how that blows up
(and possibly in different ways with different devices).

Picking up from where the discussion died off in the previous thread:

On transmit, I think the software fallback for retransmits will be
needed, whether we can keep two generations of keys on the device or
just one. We could have 2 consecutive rekeys, without even worrying
about a broken peer spamming key updates for both sides (or the local
user's library doing that). If devices can juggle 3 generations of
keys, then maybe we don't have to worry too much about software
fallback, but we'll need to define an API to set the extra keys ahead
of time and then advance to the next one. Will all devices support
installing 2 or 3 keys?

On receive, we also have the problem of more than one rekey arriving,
so if we can't feed enough keys to the device in advance, we'll have
to decrypt some records in software. The host will have to survive the
burst of software decryption while we wait until the device config
catches up.

One option might be to do the key derivation in the kernel following
section 7.2 of the RFC [1]. I don't know how happy crypto/security
people would be with that. We'd have to introduce new crypto_info
structs, and new cipher types (or a flag in the upper bits of the
cipher type) to go with them. Then the kernel processes incoming key
update messages on its own, and emits its own key update messages when
its current key is expiring. On transmit we also need to inject a
Finished message before the KeyUpdate [2]. That's bringing a lot of
TLS logic in the kernel. At that point we might as well do the whole
handshake... but I really hope it doesn't come to that.

It's hard, but I don't think "let's just kill the connection" is a
nice solution. It's definitely easier for the kernel.

[1] https://www.rfc-editor.org/rfc/rfc8446#section-7.2
[2] https://www.rfc-editor.org/rfc/rfc8446#section-4.6.3

-- 
Sabrina


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3
  2023-02-15 23:23       ` Sabrina Dubroca
@ 2023-02-16  3:57         ` Jakub Kicinski
  2023-02-16 16:23           ` Sabrina Dubroca
  0 siblings, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2023-02-16  3:57 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: netdev, Vadim Fedorenko, Frantisek Krenzelok, Kuniyuki Iwashima,
	Apoorv Kothari, Boris Pismenny, John Fastabend, Shuah Khan,
	linux-kselftest, Gal Pressman, Marcel Holtmann

On Thu, 16 Feb 2023 00:23:11 +0100 Sabrina Dubroca wrote:
> > Offloads being available for 1.2 may be stalling adoption of 1.3
> > (just a guess, I run across this article mentioning 1.2 being used
> > in Oracle cloud for instance:
> > https://blogs.oracle.com/cloudsecurity/post/how-oci-helps-you-protect-data-with-default-encryption
> > could be because MITM requirements, or maybe they have HW which
> > can only do 1.2? Dunno).
> > 
> > But I'm willing to compromise, we just need a solid plan of how to
> > handle the inevitable. I'm worried that how this will pay out is:
> >  - you don't care about offload and add rekey  
> 
> I think that's a bit unfair. Not having to deal with offload at all
> would make things easier for me, sure, but I'm open to the discussion,
> even if I don't have a good understanding of the offloading side.
> 
> I'd just like to avoid holding this feature (arguably a bug fix) until
> the vendors finally decide that they care about 1.3, if possible. If
> not, so be it.
> 
> I wasn't trying to force you to accept this series. Sorry if that's
> what it sounded like. I really wanted to understand what you were
> asking for, because your question wasn't clear to me. Now it makes
> sense.
> 
> >  - vendors don't care about rekey and add 1.3
> >   ... time passes ...
> >  - both you and the vendors have moved on
> >  - users run into issues, waste their time debugging and
> >    eventually report the problem upstream
> >  - it's on me to fix?
> > 
> > :(  
> 
> Yeah, I see. If the rekey already exists in SW, I think it'll be a bit
> harder for them to just not care about it, but maybe I'm being
> optimistic.

True, they may try to weasel out / require some pushing and support.
Depends on which vendor gets to it first, I guess.

> I'm not sure we can come up with the correct uAPI/rekey design without
> trying to implement rekey with offload and seeing how that blows up
> (and possibly in different ways with different devices).

Yes, best we can do now is have a plan in place... and your promise 
of future help? :) (incl. being on the lookout for when the patches 
come because I'll probably forget)

> Picking up from where the discussion died off in the previous thread:
> 
> On transmit, I think the software fallback for retransmits will be
> needed, whether we can keep two generations of keys on the device or
> just one. We could have 2 consecutive rekeys, without even worrying
> about a broken peer spamming key updates for both sides (or the local
> user's library doing that). If devices can juggle 3 generations of
> keys, then maybe we don't have to worry too much about software
> fallback, but we'll need to define an API to set the extra keys ahead
> of time and then advance to the next one. Will all devices support
> installing 2 or 3 keys?

I think we could try to switch to SW crypto on Tx until all data using
old key is ACK'ed, drivers can look at skb->decrypted to skip touching
the transitional skbs. Then remove old key, install new one, resume
offload.

We may need special care to make sure we don't try to encrypt the same
packet with both keys. In case a rtx gets stuck somewhere and comes to
the NIC after it's already acked (happens surprisingly often).

Multiple keys on the device would probably mean the device needs some
intelligence to know when to use which - not my first choice.

> On receive, we also have the problem of more than one rekey arriving,
> so if we can't feed enough keys to the device in advance, we'll have
> to decrypt some records in software. The host will have to survive the
> burst of software decryption while we wait until the device config
> catches up.

I think receive is easier. The fallback is quite effective and already
in place. Here too we may want to enforce some transitional SW-only
mode to avoid the (highly unlikely?) case that NIC will decrypt
successfully a packet with the old key, even tho new key should be used.
Carrying "key ID" with the skb is probably an overkill.

> One option might be to do the key derivation in the kernel following
> section 7.2 of the RFC [1]. I don't know how happy crypto/security
> people would be with that. We'd have to introduce new crypto_info
> structs, and new cipher types (or a flag in the upper bits of the
> cipher type) to go with them. Then the kernel processes incoming key
> update messages on its own, and emits its own key update messages when
> its current key is expiring. On transmit we also need to inject a
> Finished message before the KeyUpdate [2]. That's bringing a lot of
> TLS logic in the kernel. At that point we might as well do the whole
> handshake... but I really hope it doesn't come to that.

I think it's mostly a device vs host state sharing problem, so TLS ULP
or user space - not a big difference, both are on the host.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3
  2023-02-16  3:57         ` Jakub Kicinski
@ 2023-02-16 16:23           ` Sabrina Dubroca
  2023-02-22  3:19             ` Jakub Kicinski
  0 siblings, 1 reply; 21+ messages in thread
From: Sabrina Dubroca @ 2023-02-16 16:23 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Vadim Fedorenko, Frantisek Krenzelok, Kuniyuki Iwashima,
	Apoorv Kothari, Boris Pismenny, John Fastabend, Shuah Khan,
	linux-kselftest, Gal Pressman, Marcel Holtmann

2023-02-15, 19:57:48 -0800, Jakub Kicinski wrote:
> On Thu, 16 Feb 2023 00:23:11 +0100 Sabrina Dubroca wrote:
> > > Offloads being available for 1.2 may be stalling adoption of 1.3
> > > (just a guess, I run across this article mentioning 1.2 being used
> > > in Oracle cloud for instance:
> > > https://blogs.oracle.com/cloudsecurity/post/how-oci-helps-you-protect-data-with-default-encryption
> > > could be because MITM requirements, or maybe they have HW which
> > > can only do 1.2? Dunno).
> > > 
> > > But I'm willing to compromise, we just need a solid plan of how to
> > > handle the inevitable. I'm worried that how this will pay out is:
> > >  - you don't care about offload and add rekey  
> > 
> > I think that's a bit unfair. Not having to deal with offload at all
> > would make things easier for me, sure, but I'm open to the discussion,
> > even if I don't have a good understanding of the offloading side.
> > 
> > I'd just like to avoid holding this feature (arguably a bug fix) until
> > the vendors finally decide that they care about 1.3, if possible. If
> > not, so be it.
> > 
> > I wasn't trying to force you to accept this series. Sorry if that's
> > what it sounded like. I really wanted to understand what you were
> > asking for, because your question wasn't clear to me. Now it makes
> > sense.
> > 
> > >  - vendors don't care about rekey and add 1.3
> > >   ... time passes ...
> > >  - both you and the vendors have moved on
> > >  - users run into issues, waste their time debugging and
> > >    eventually report the problem upstream
> > >  - it's on me to fix?
> > > 
> > > :(  
> > 
> > Yeah, I see. If the rekey already exists in SW, I think it'll be a bit
> > harder for them to just not care about it, but maybe I'm being
> > optimistic.
> 
> True, they may try to weasel out / require some pushing and support.
> Depends on which vendor gets to it first, I guess.
> 
> > I'm not sure we can come up with the correct uAPI/rekey design without
> > trying to implement rekey with offload and seeing how that blows up
> > (and possibly in different ways with different devices).
> 
> Yes, best we can do now is have a plan in place... and your promise 
> of future help? :) (incl. being on the lookout for when the patches 
> come because I'll probably forget)

I'll try to help. None of us can guarantee that we'll be around :)

> > Picking up from where the discussion died off in the previous thread:
> > 
> > On transmit, I think the software fallback for retransmits will be
> > needed, whether we can keep two generations of keys on the device or
> > just one. We could have 2 consecutive rekeys, without even worrying
> > about a broken peer spamming key updates for both sides (or the local
> > user's library doing that). If devices can juggle 3 generations of
> > keys, then maybe we don't have to worry too much about software
> > fallback, but we'll need to define an API to set the extra keys ahead
> > of time and then advance to the next one. Will all devices support
> > installing 2 or 3 keys?
> 
> I think we could try to switch to SW crypto on Tx until all data using
> old key is ACK'ed, drivers can look at skb->decrypted to skip touching
> the transitional skbs. Then remove old key, install new one, resume
> offload.

"all data using the old key" needs to be one list of record per old
key, since we can have multiple rekeys.

Could we install the new key in HW a bit earlier? Keep the old key as
SW fallback for rtx, but the driver installs the new key when the
corresponding KeyUpdate record has gone through and tells the stack to
stop doing SW crypto? I'm not sure that'd be a significant improvement
in the standard case, though.

> We may need special care to make sure we don't try to encrypt the same
> packet with both keys. In case a rtx gets stuck somewhere and comes to
> the NIC after it's already acked (happens surprisingly often).

Don't we have that already? If there's a retransmit while we're
setting the TX key in HW, data that was queued on the socket before
(and shouldn't be encrypted at all) would also be encrypted
otherwise. Or is it different with rekey?

> Multiple keys on the device would probably mean the device needs some
> intelligence to know when to use which - not my first choice.

I only mentioned that because Boris talked about it in the previous
thread.  It's also not my first choice because I doubt all devices
will support it the same way.

> > On receive, we also have the problem of more than one rekey arriving,
> > so if we can't feed enough keys to the device in advance, we'll have
> > to decrypt some records in software. The host will have to survive the
> > burst of software decryption while we wait until the device config
> > catches up.
> 
> I think receive is easier. The fallback is quite effective and already
> in place.

I was thinking about a peer sending at line rate just after the rekey,
and we have to handle that in software for a short while. If the peer
is also dropping to a software fallback, I guess there's no issue,
both sides will slow down.

> Here too we may want to enforce some transitional SW-only
> mode to avoid the (highly unlikely?) case that NIC will decrypt
> successfully a packet with the old key, even tho new key should be used.

Not a crypto expert, but I don't think that's a realistic thing to
worry about.

If the NIC uses the old key to decrypt a record that was encrypted
with the new key and fails, we end up seeing the encrypted record and
decrypting it in SW, right? (and then we can pick the correct key in
the SW fallback) We don't just tear down the connection?

> Carrying "key ID" with the skb is probably an overkill.

I think we can retrieve it from the sequence of records. When we see a
KeyUpdate, we advance the "key ID" after we're done with the message.


This makes me wonder again if we should have fake offloads on veth
(still calling the kernel's crypto library to simulate a device doing
the encryption and/or decryption), to make it easy to play with the
software bits, without requiring actual hardware that can offload
TLS/IPsec/MACsec. But maybe it's too complex to be useful and we'd
just waste our time fixing bugs in the fake offload rather than
improving the stack.

> > One option might be to do the key derivation in the kernel following
> > section 7.2 of the RFC [1]. I don't know how happy crypto/security
> > people would be with that. We'd have to introduce new crypto_info
> > structs, and new cipher types (or a flag in the upper bits of the
> > cipher type) to go with them. Then the kernel processes incoming key
> > update messages on its own, and emits its own key update messages when
> > its current key is expiring. On transmit we also need to inject a
> > Finished message before the KeyUpdate [2]. That's bringing a lot of
> > TLS logic in the kernel. At that point we might as well do the whole
> > handshake... but I really hope it doesn't come to that.
> 
> I think it's mostly a device vs host state sharing problem, so TLS ULP
> or user space - not a big difference, both are on the host.

Maybe. But that's one more implementation of TLS (or large parts of
it) that admins and security teams have to worry about. Maintained by
networking people. I really want to avoid going down that route.


I'll try to dig more into what you're proposing and to poke some holes
into it over the next few days.

-- 
Sabrina


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3
  2023-02-16 16:23           ` Sabrina Dubroca
@ 2023-02-22  3:19             ` Jakub Kicinski
  2023-02-23 16:27               ` Sabrina Dubroca
  0 siblings, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2023-02-22  3:19 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: netdev, Vadim Fedorenko, Frantisek Krenzelok, Kuniyuki Iwashima,
	Apoorv Kothari, Boris Pismenny, John Fastabend, Shuah Khan,
	linux-kselftest, Gal Pressman, Marcel Holtmann

Sorry for the delay, long weekend + merge window.

On Thu, 16 Feb 2023 17:23:19 +0100 Sabrina Dubroca wrote:
> 2023-02-15, 19:57:48 -0800, Jakub Kicinski wrote:
> > On Thu, 16 Feb 2023 00:23:11 +0100 Sabrina Dubroca wrote:  
> > > I'm not sure we can come up with the correct uAPI/rekey design without
> > > trying to implement rekey with offload and seeing how that blows up
> > > (and possibly in different ways with different devices).  
> > 
> > Yes, best we can do now is have a plan in place... and your promise 
> > of future help? :) (incl. being on the lookout for when the patches 
> > come because I'll probably forget)  
> 
> I'll try to help. None of us can guarantee that we'll be around :)

Right.

> > I think we could try to switch to SW crypto on Tx until all data using
> > old key is ACK'ed, drivers can look at skb->decrypted to skip touching
> > the transitional skbs. Then remove old key, install new one, resume
> > offload.  
> 
> "all data using the old key" needs to be one list of record per old
> key, since we can have multiple rekeys.

No fully parsing this bit.

> Could we install the new key in HW a bit earlier? Keep the old key as
> SW fallback for rtx, but the driver installs the new key when the
> corresponding KeyUpdate record has gone through and tells the stack to
> stop doing SW crypto? I'm not sure that'd be a significant improvement
> in the standard case, though.

Important consideration is making the non-rekey path as fast as
possible (given rekeying is extremely rare). Looking at skb->decrypted
should be very fast but we can possibly fit some other indication of
"are we rekeying" into another already referenced cache line.
We definitely don't want to have to look up the record to know what
state we're in.

The fallback can't use AES-NI (it's in sirq context) so it's slower 
than SW encrypt before queuing to TCP. Hence my first thought is using
SW crypto for new key and let the traffic we already queued with old
key drain leveraging HW crypto. But as I said the impact on performance
when not rekeying is more important, and so is driver simplicity.

> > We may need special care to make sure we don't try to encrypt the same
> > packet with both keys. In case a rtx gets stuck somewhere and comes to
> > the NIC after it's already acked (happens surprisingly often).  
> 
> Don't we have that already? If there's a retransmit while we're
> setting the TX key in HW, data that was queued on the socket before
> (and shouldn't be encrypted at all) would also be encrypted
> otherwise. Or is it different with rekey?

We have a "start marker" record which is supposed to indicate that
anything before it has already been encrypted. The driver is programmed
with the start seq no, when it sees a packet from before this seq no
it checks if a record exists, finds its before the start marker and
sends the data as is.

> > Multiple keys on the device would probably mean the device needs some
> > intelligence to know when to use which - not my first choice.  
> 
> I only mentioned that because Boris talked about it in the previous
> thread.  It's also not my first choice because I doubt all devices
> will support it the same way.
> 
> > > On receive, we also have the problem of more than one rekey arriving,
> > > so if we can't feed enough keys to the device in advance, we'll have
> > > to decrypt some records in software. The host will have to survive the
> > > burst of software decryption while we wait until the device config
> > > catches up.  
> > 
> > I think receive is easier. The fallback is quite effective and already
> > in place.  
> 
> I was thinking about a peer sending at line rate just after the rekey,
> and we have to handle that in software for a short while. If the peer
> is also dropping to a software fallback, I guess there's no issue,
> both sides will slow down.
> 
> > Here too we may want to enforce some transitional SW-only
> > mode to avoid the (highly unlikely?) case that NIC will decrypt
> > successfully a packet with the old key, even tho new key should be used.  
> 
> Not a crypto expert, but I don't think that's a realistic thing to
> worry about.
> 
> If the NIC uses the old key to decrypt a record that was encrypted
> with the new key and fails, we end up seeing the encrypted record and
> decrypting it in SW, right? (and then we can pick the correct key in
> the SW fallback) We don't just tear down the connection?

Right, in the common case it should just work. We're basically up
against the probability of a hash collision on the auth tag (match 
with both old and new key).

> > Carrying "key ID" with the skb is probably an overkill.  
> 
> I think we can retrieve it from the sequence of records. When we see a
> KeyUpdate, we advance the "key ID" after we're done with the message.
> 
> 
> This makes me wonder again if we should have fake offloads on veth
> (still calling the kernel's crypto library to simulate a device doing
> the encryption and/or decryption), to make it easy to play with the
> software bits, without requiring actual hardware that can offload
> TLS/IPsec/MACsec. But maybe it's too complex to be useful and we'd
> just waste our time fixing bugs in the fake offload rather than
> improving the stack.

It should be quite useful. I also usually just hack up veth, but 
I reckon adding support to netdevsim would be a better fit.
We just need a way to tell two netdevsim ports to "connect to each
other".

> > > One option might be to do the key derivation in the kernel following
> > > section 7.2 of the RFC [1]. I don't know how happy crypto/security
> > > people would be with that. We'd have to introduce new crypto_info
> > > structs, and new cipher types (or a flag in the upper bits of the
> > > cipher type) to go with them. Then the kernel processes incoming key
> > > update messages on its own, and emits its own key update messages when
> > > its current key is expiring. On transmit we also need to inject a
> > > Finished message before the KeyUpdate [2]. That's bringing a lot of
> > > TLS logic in the kernel. At that point we might as well do the whole
> > > handshake... but I really hope it doesn't come to that.  
> > 
> > I think it's mostly a device vs host state sharing problem, so TLS ULP
> > or user space - not a big difference, both are on the host.  
> 
> Maybe. But that's one more implementation of TLS (or large parts of
> it) that admins and security teams have to worry about. Maintained by
> networking people. I really want to avoid going down that route.
> 
> 
> I'll try to dig more into what you're proposing and to poke some holes
> into it over the next few days.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3
  2023-02-22  3:19             ` Jakub Kicinski
@ 2023-02-23 16:27               ` Sabrina Dubroca
  2023-02-23 17:29                 ` Jakub Kicinski
  0 siblings, 1 reply; 21+ messages in thread
From: Sabrina Dubroca @ 2023-02-23 16:27 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Vadim Fedorenko, Frantisek Krenzelok, Kuniyuki Iwashima,
	Apoorv Kothari, Boris Pismenny, John Fastabend, Shuah Khan,
	linux-kselftest, Gal Pressman, Marcel Holtmann

2023-02-21, 19:19:44 -0800, Jakub Kicinski wrote:
> Sorry for the delay, long weekend + merge window.

No worries, I wasn't expecting much activity on this from you during
the merge window.

> On Thu, 16 Feb 2023 17:23:19 +0100 Sabrina Dubroca wrote:
> > 2023-02-15, 19:57:48 -0800, Jakub Kicinski wrote:
> > > I think we could try to switch to SW crypto on Tx until all data using
> > > old key is ACK'ed, drivers can look at skb->decrypted to skip touching
> > > the transitional skbs. Then remove old key, install new one, resume
> > > offload.  
> > 
> > "all data using the old key" needs to be one list of record per old
> > key, since we can have multiple rekeys.
> 
> No fully parsing this bit.

We can have multiple rekeys in the time it takes to get an ACK for the
first KeyUpdate message to be ACK'ed. I'm not sure why I talked about
a "list of records".

But we could have this sequence of records:

  recN(k1,hwenc)
  KeyUpdate(k1,hwenc)
  // switch to k2 and sw crypto

  rec0(k2,swenc)
  rec1(k2,swenc)
  KeyUpdate(k2,swenc)
  rec0(k3,swenc)
  // receive ACK for KU1, don't enable HW offload for k2 or k3 because we've already switched off k2

  rec1(k3,swenc)
  // receive ACK for KU2, now enable HW offload for k3

  rec2(k3,hwenc)

So we'll need to record the most recent TX rekey, and wait until the
corresponding KU record is ACK'ed, before we resume offload using the
most recent key (and skip possible intermediate keys).

Installing the key in HW and re-enabling the offload will need to
happen via the icsk_clean_acked callback. We'll need a workqueue so
that we don't actually talk to the driver from softirq.

Then, we have to handle a failure to install the key. Since we're not
installing it in HW immediately during setsockopt, notifying userspace
of a rekey failure is more complicated. Maybe we can do a
rekey_prepare during the setsocktopt, and then the actual rekey is an
operation that cannot fail?

> > Could we install the new key in HW a bit earlier? Keep the old key as
> > SW fallback for rtx, but the driver installs the new key when the
> > corresponding KeyUpdate record has gone through and tells the stack to
> > stop doing SW crypto? I'm not sure that'd be a significant improvement
> > in the standard case, though.
> 
> Important consideration is making the non-rekey path as fast as
> possible (given rekeying is extremely rare). Looking at skb->decrypted
> should be very fast but we can possibly fit some other indication of
> "are we rekeying" into another already referenced cache line.
> We definitely don't want to have to look up the record to know what
> state we're in.
> 
> The fallback can't use AES-NI (it's in sirq context) so it's slower 
> than SW encrypt before queuing to TCP. Hence my first thought is using
> SW crypto for new key and let the traffic we already queued with old
> key drain leveraging HW crypto. But as I said the impact on performance
> when not rekeying is more important, and so is driver simplicity.

Right, sorry, full tls_sw path and not the existing fallback.

Changing the socket ops back and forth between the HW and SW variants
worries me, because we only lock the socket once we have entered
tls_{device,sw}_sendmsg. So I think we have to stay on the _device ops
even during the SW crypto phase of the rekey, and let that call into
the SW variant after locking the socket and making sure we're in a
rekey.

> > > We may need special care to make sure we don't try to encrypt the same
> > > packet with both keys. In case a rtx gets stuck somewhere and comes to
> > > the NIC after it's already acked (happens surprisingly often).  
> > 
> > Don't we have that already? If there's a retransmit while we're
> > setting the TX key in HW, data that was queued on the socket before
> > (and shouldn't be encrypted at all) would also be encrypted
> > otherwise. Or is it different with rekey?
> 
> We have a "start marker" record which is supposed to indicate that
> anything before it has already been encrypted. The driver is programmed
> with the start seq no, when it sees a packet from before this seq no
> it checks if a record exists, finds its before the start marker and
> sends the data as is.

Yes, I was looking into that earlier this week. I think we could reuse
a similar mechanism for rekeying. tls_dev_add takes tcp_sk->write_seq,
we could have a tls_dev_rekey op passing the new key and new write_seq
to the driver. I think we can also reuse the ->eor trick from
tls_set_device_offload, and we wouldn't have to look at
skb->decrypted. Close and push the current SW record, mark ->eor, pass
write_seq to the driver along with the key. Also pretty close to what
tls_device_resync_tx does.

[...]
> > This makes me wonder again if we should have fake offloads on veth
> > (still calling the kernel's crypto library to simulate a device doing
> > the encryption and/or decryption), to make it easy to play with the
> > software bits, without requiring actual hardware that can offload
> > TLS/IPsec/MACsec. But maybe it's too complex to be useful and we'd
> > just waste our time fixing bugs in the fake offload rather than
> > improving the stack.
> 
> It should be quite useful. I also usually just hack up veth, but 
> I reckon adding support to netdevsim would be a better fit.
> We just need a way to tell two netdevsim ports to "connect to each
> other".

Oh, nice idea. I'll add that to my todo list.

-- 
Sabrina


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3
  2023-02-23 16:27               ` Sabrina Dubroca
@ 2023-02-23 17:29                 ` Jakub Kicinski
  2023-03-13 15:41                   ` Sabrina Dubroca
  0 siblings, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2023-02-23 17:29 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: netdev, Vadim Fedorenko, Frantisek Krenzelok, Kuniyuki Iwashima,
	Apoorv Kothari, Boris Pismenny, John Fastabend, Shuah Khan,
	linux-kselftest, Gal Pressman, Marcel Holtmann

On Thu, 23 Feb 2023 17:27:40 +0100 Sabrina Dubroca wrote:
> 2023-02-21, 19:19:44 -0800, Jakub Kicinski wrote:
> > > "all data using the old key" needs to be one list of record per old
> > > key, since we can have multiple rekeys.  
> > 
> > No fully parsing this bit.  
> 
> We can have multiple rekeys in the time it takes to get an ACK for the
> first KeyUpdate message to be ACK'ed. I'm not sure why I talked about
> a "list of records".
> 
> But we could have this sequence of records:
> 
>   recN(k1,hwenc)
>   KeyUpdate(k1,hwenc)
>   // switch to k2 and sw crypto
> 
>   rec0(k2,swenc)
>   rec1(k2,swenc)
>   KeyUpdate(k2,swenc)
>   rec0(k3,swenc)
>   // receive ACK for KU1, don't enable HW offload for k2 or k3 because we've already switched off k2
> 
>   rec1(k3,swenc)
>   // receive ACK for KU2, now enable HW offload for k3
> 
>   rec2(k3,hwenc)
> 
> So we'll need to record the most recent TX rekey, and wait until the
> corresponding KU record is ACK'ed, before we resume offload using the
> most recent key (and skip possible intermediate keys).
> 
> Installing the key in HW and re-enabling the offload will need to
> happen via the icsk_clean_acked callback. We'll need a workqueue so
> that we don't actually talk to the driver from softirq.

Installing from icsk_clean_acked won't win us anything, right?
We'll only need the key once the next sendmsg() comes, what's
pushed to TCP with swenc is already out of our hands.

> Then, we have to handle a failure to install the key. Since we're not
> installing it in HW immediately during setsockopt, notifying userspace
> of a rekey failure is more complicated. Maybe we can do a
> rekey_prepare during the setsocktopt, and then the actual rekey is an
> operation that cannot fail?

TLS offload silently falls back to SW on any errors. So that's fine.
Just bump a counter. User/infra must be tracking error counters in 
our current design already.

> > Important consideration is making the non-rekey path as fast as
> > possible (given rekeying is extremely rare). Looking at skb->decrypted
> > should be very fast but we can possibly fit some other indication of
> > "are we rekeying" into another already referenced cache line.
> > We definitely don't want to have to look up the record to know what
> > state we're in.
> > 
> > The fallback can't use AES-NI (it's in sirq context) so it's slower 
> > than SW encrypt before queuing to TCP. Hence my first thought is using
> > SW crypto for new key and let the traffic we already queued with old
> > key drain leveraging HW crypto. But as I said the impact on performance
> > when not rekeying is more important, and so is driver simplicity.  
> 
> Right, sorry, full tls_sw path and not the existing fallback.
> 
> Changing the socket ops back and forth between the HW and SW variants
> worries me, because we only lock the socket once we have entered
> tls_{device,sw}_sendmsg. So I think we have to stay on the _device ops
> even during the SW crypto phase of the rekey, and let that call into
> the SW variant after locking the socket and making sure we're in a
> rekey.

Fair point :S

> > > Don't we have that already? If there's a retransmit while we're
> > > setting the TX key in HW, data that was queued on the socket before
> > > (and shouldn't be encrypted at all) would also be encrypted
> > > otherwise. Or is it different with rekey?  
> > 
> > We have a "start marker" record which is supposed to indicate that
> > anything before it has already been encrypted. The driver is programmed
> > with the start seq no, when it sees a packet from before this seq no
> > it checks if a record exists, finds its before the start marker and
> > sends the data as is.  
> 
> Yes, I was looking into that earlier this week. I think we could reuse
> a similar mechanism for rekeying. tls_dev_add takes tcp_sk->write_seq,
> we could have a tls_dev_rekey op passing the new key and new write_seq
> to the driver. I think we can also reuse the ->eor trick from
> tls_set_device_offload, and we wouldn't have to look at
> skb->decrypted. Close and push the current SW record, mark ->eor, pass
> write_seq to the driver along with the key. Also pretty close to what
> tls_device_resync_tx does.

That sounds like you'd expose the rekeying logic to the drivers?
New op, having to track seq#...


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3
  2023-02-23 17:29                 ` Jakub Kicinski
@ 2023-03-13 15:41                   ` Sabrina Dubroca
  2023-03-13 18:35                     ` Jakub Kicinski
  0 siblings, 1 reply; 21+ messages in thread
From: Sabrina Dubroca @ 2023-03-13 15:41 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Vadim Fedorenko, Frantisek Krenzelok, Kuniyuki Iwashima,
	Apoorv Kothari, Boris Pismenny, John Fastabend, Shuah Khan,
	linux-kselftest, Gal Pressman, Marcel Holtmann

2023-02-23, 09:29:45 -0800, Jakub Kicinski wrote:
> On Thu, 23 Feb 2023 17:27:40 +0100 Sabrina Dubroca wrote:
> > Installing the key in HW and re-enabling the offload will need to
> > happen via the icsk_clean_acked callback. We'll need a workqueue so
> > that we don't actually talk to the driver from softirq.
> 
> Installing from icsk_clean_acked won't win us anything, right?
> We'll only need the key once the next sendmsg() comes, what's
> pushed to TCP with swenc is already out of our hands.

Avoiding an unpredictable slowdown on the sendmsg() call? We can deal
with that later if it turns out to be an issue. I simply didn't think
of deferring to the next sendmsg().

> > Then, we have to handle a failure to install the key. Since we're not
> > installing it in HW immediately during setsockopt, notifying userspace
> > of a rekey failure is more complicated. Maybe we can do a
> > rekey_prepare during the setsocktopt, and then the actual rekey is an
> > operation that cannot fail?
> 
> TLS offload silently falls back to SW on any errors. So that's fine.
> Just bump a counter. User/infra must be tracking error counters in 
> our current design already.

True. User might be a bit surprised by "well it was offloaded and now
it's not", but ok.

> > > Important consideration is making the non-rekey path as fast as
> > > possible (given rekeying is extremely rare). Looking at skb->decrypted
> > > should be very fast but we can possibly fit some other indication of
> > > "are we rekeying" into another already referenced cache line.
> > > We definitely don't want to have to look up the record to know what
> > > state we're in.
> > > 
> > > The fallback can't use AES-NI (it's in sirq context) so it's slower 
> > > than SW encrypt before queuing to TCP. Hence my first thought is using
> > > SW crypto for new key and let the traffic we already queued with old
> > > key drain leveraging HW crypto. But as I said the impact on performance
> > > when not rekeying is more important, and so is driver simplicity.  
> > 
> > Right, sorry, full tls_sw path and not the existing fallback.
> > 
> > Changing the socket ops back and forth between the HW and SW variants
> > worries me, because we only lock the socket once we have entered
> > tls_{device,sw}_sendmsg. So I think we have to stay on the _device ops
> > even during the SW crypto phase of the rekey, and let that call into
> > the SW variant after locking the socket and making sure we're in a
> > rekey.
> 
> Fair point :S
> 
> > > > Don't we have that already? If there's a retransmit while we're
> > > > setting the TX key in HW, data that was queued on the socket before
> > > > (and shouldn't be encrypted at all) would also be encrypted
> > > > otherwise. Or is it different with rekey?  
> > > 
> > > We have a "start marker" record which is supposed to indicate that
> > > anything before it has already been encrypted. The driver is programmed
> > > with the start seq no, when it sees a packet from before this seq no
> > > it checks if a record exists, finds its before the start marker and
> > > sends the data as is.  
> > 
> > Yes, I was looking into that earlier this week. I think we could reuse
> > a similar mechanism for rekeying. tls_dev_add takes tcp_sk->write_seq,
> > we could have a tls_dev_rekey op passing the new key and new write_seq
> > to the driver. I think we can also reuse the ->eor trick from
> > tls_set_device_offload, and we wouldn't have to look at
> > skb->decrypted. Close and push the current SW record, mark ->eor, pass
> > write_seq to the driver along with the key. Also pretty close to what
> > tls_device_resync_tx does.
> 
> That sounds like you'd expose the rekeying logic to the drivers?
> New op, having to track seq#...

Well, we have to call into the drivers to install the key, whether
that's a new rekey op, or adding an update argument to ->tls_dev_add,
or letting the driver guess that it's a rekey (or ignore that and just
install the key if rekey vs initial key isn't a meaningful
distinction).

We already feed drivers the seq# with ->tls_dev_add, so passing it for
rekeys as well is not a big change.

Does that seem problematic? Adding a rekey op seemed more natural to
me than simply using the existing _del + _add ops, but maybe we can
get away with just using those two ops.

-- 
Sabrina


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3
  2023-03-13 15:41                   ` Sabrina Dubroca
@ 2023-03-13 18:35                     ` Jakub Kicinski
  2023-03-22 16:10                       ` Sabrina Dubroca
  0 siblings, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2023-03-13 18:35 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: netdev, Vadim Fedorenko, Frantisek Krenzelok, Kuniyuki Iwashima,
	Apoorv Kothari, Boris Pismenny, John Fastabend, Shuah Khan,
	linux-kselftest, Gal Pressman, Marcel Holtmann

On Mon, 13 Mar 2023 16:41:36 +0100 Sabrina Dubroca wrote:
> > > Yes, I was looking into that earlier this week. I think we could reuse
> > > a similar mechanism for rekeying. tls_dev_add takes tcp_sk->write_seq,
> > > we could have a tls_dev_rekey op passing the new key and new write_seq
> > > to the driver. I think we can also reuse the ->eor trick from
> > > tls_set_device_offload, and we wouldn't have to look at
> > > skb->decrypted. Close and push the current SW record, mark ->eor, pass
> > > write_seq to the driver along with the key. Also pretty close to what
> > > tls_device_resync_tx does.  
> > 
> > That sounds like you'd expose the rekeying logic to the drivers?
> > New op, having to track seq#...  
> 
> Well, we have to call into the drivers to install the key, whether
> that's a new rekey op, or adding an update argument to ->tls_dev_add,
> or letting the driver guess that it's a rekey (or ignore that and just
> install the key if rekey vs initial key isn't a meaningful
> distinction).
> 
> We already feed drivers the seq# with ->tls_dev_add, so passing it for
> rekeys as well is not a big change.
> 
> Does that seem problematic? Adding a rekey op seemed more natural to
> me than simply using the existing _del + _add ops, but maybe we can
> get away with just using those two ops.

Theoretically a rekey op is nicer and cleaner. Practically the quality
of the driver implementations will vary wildly*, and it's a significant
time investment to review all of them. So for non-technical reasons my
intuition is that we'd deliver a better overall user experience if we
handled the rekey entirely in the core.

Wait for old key to no longer be needed, _del + _add, start using the
offload again.

* One vendor submitted a driver claiming support for TLS 1.3, when 
  TLS 1.3 offload was rejected by the core. So this is the level of
  testing and diligence we're working with :(

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3
  2023-03-13 18:35                     ` Jakub Kicinski
@ 2023-03-22 16:10                       ` Sabrina Dubroca
  2023-03-22 19:43                         ` Jakub Kicinski
  0 siblings, 1 reply; 21+ messages in thread
From: Sabrina Dubroca @ 2023-03-22 16:10 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Vadim Fedorenko, Frantisek Krenzelok, Kuniyuki Iwashima,
	Apoorv Kothari, Boris Pismenny, John Fastabend, Shuah Khan,
	linux-kselftest, Gal Pressman, Marcel Holtmann

2023-03-13, 11:35:10 -0700, Jakub Kicinski wrote:
> On Mon, 13 Mar 2023 16:41:36 +0100 Sabrina Dubroca wrote:
> > > > Yes, I was looking into that earlier this week. I think we could reuse
> > > > a similar mechanism for rekeying. tls_dev_add takes tcp_sk->write_seq,
> > > > we could have a tls_dev_rekey op passing the new key and new write_seq
> > > > to the driver. I think we can also reuse the ->eor trick from
> > > > tls_set_device_offload, and we wouldn't have to look at
> > > > skb->decrypted. Close and push the current SW record, mark ->eor, pass
> > > > write_seq to the driver along with the key. Also pretty close to what
> > > > tls_device_resync_tx does.  
> > > 
> > > That sounds like you'd expose the rekeying logic to the drivers?
> > > New op, having to track seq#...  
> > 
> > Well, we have to call into the drivers to install the key, whether
> > that's a new rekey op, or adding an update argument to ->tls_dev_add,
> > or letting the driver guess that it's a rekey (or ignore that and just
> > install the key if rekey vs initial key isn't a meaningful
> > distinction).
> > 
> > We already feed drivers the seq# with ->tls_dev_add, so passing it for
> > rekeys as well is not a big change.
> > 
> > Does that seem problematic? Adding a rekey op seemed more natural to
> > me than simply using the existing _del + _add ops, but maybe we can
> > get away with just using those two ops.
> 
> Theoretically a rekey op is nicer and cleaner. Practically the quality
> of the driver implementations will vary wildly*, and it's a significant
> time investment to review all of them. So for non-technical reasons my
> intuition is that we'd deliver a better overall user experience if we
> handled the rekey entirely in the core.
> 
> Wait for old key to no longer be needed, _del + _add, start using the
> offload again.
> 
> * One vendor submitted a driver claiming support for TLS 1.3, when 
>   TLS 1.3 offload was rejected by the core. So this is the level of
>   testing and diligence we're working with :(

:(

Ok, _del + _add then.


I went over the thread to summarize what we've come up with so far:

RX
 - The existing SW path will handle all records between the KeyUpdate
   message signaling the change of key and the new key becoming known
   to the kernel -- those will be queued encrypted, and decrypted in
   SW as they are read by userspace (once the key is provided, ie same
   as this patchset)
 - Call ->tls_dev_del + ->tls_dev_add immediately during
   setsockopt(TLS_RX)

TX
 - After setsockopt(TLS_TX), switch to the existing SW path (not the
   current device_fallback) until we're able to re-enable HW offload
   - tls_device_{sendmsg,sendpage} will call into
     tls_sw_{sendmsg,sendpage} under lock_sock to avoid changing
     socket ops during the rekey while another thread might be waiting
     on the lock
 - We only re-enable HW offload (call ->tls_dev_add to install the new
   key in HW) once all records sent with the old key have been
   ACKed. At this point, all unacked records are SW-encrypted with the
   new key, and the old key is unused by both HW and retransmissions.
   - If there are no unacked records when userspace does
     setsockopt(TLS_TX), we can (try to) install the new key in HW
     immediately.
   - If yet another key has been provided via setsockopt(TLS_TX), we
     don't install intermediate keys, only the latest.
   - TCP notifies ktls of ACKs via the icsk_clean_acked callback. In
     case of a rekey, tls_icsk_clean_acked will record when all data
     sent with the most recent past key has been sent. The next call
     to sendmsg/sendpage will install the new key in HW.
   - We close and push the current SW record before reenabling
     offload.

If ->tls_dev_add fails to install the new key in HW, we stay in SW
mode. We can add a counter to keep track of this.


In addition:

Because we can't change socket ops during a rekey, we'll also have to
modify do_tls_setsockopt_conf to check ctx->tx_conf and only call
either tls_set_device_offload or tls_set_sw_offload. RX already uses
the same ops for both TLS_HW and TLS_SW, so we could switch between HW
and SW mode on rekey.

An alternative would be to have a common sendmsg/sendpage which locks
the socket and then calls the correct implementation. We'll need that
anyway for the offload under rekey case, so that would only add a test
to the SW path's ops (compared to the current code). That should allow
us to make build_protos a lot simpler.

-- 
Sabrina


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3
  2023-03-22 16:10                       ` Sabrina Dubroca
@ 2023-03-22 19:43                         ` Jakub Kicinski
  0 siblings, 0 replies; 21+ messages in thread
From: Jakub Kicinski @ 2023-03-22 19:43 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: netdev, Vadim Fedorenko, Frantisek Krenzelok, Kuniyuki Iwashima,
	Apoorv Kothari, Boris Pismenny, John Fastabend, Shuah Khan,
	linux-kselftest, Gal Pressman, Marcel Holtmann

On Wed, 22 Mar 2023 17:10:25 +0100 Sabrina Dubroca wrote:
> > Theoretically a rekey op is nicer and cleaner. Practically the quality
> > of the driver implementations will vary wildly*, and it's a significant
> > time investment to review all of them. So for non-technical reasons my
> > intuition is that we'd deliver a better overall user experience if we
> > handled the rekey entirely in the core.
> > 
> > Wait for old key to no longer be needed, _del + _add, start using the
> > offload again.
> > 
> > * One vendor submitted a driver claiming support for TLS 1.3, when 
> >   TLS 1.3 offload was rejected by the core. So this is the level of
> >   testing and diligence we're working with :(  
> 
> :(
> 
> Ok, _del + _add then.
> 
> 
> I went over the thread to summarize what we've come up with so far:
> 
> RX
>  - The existing SW path will handle all records between the KeyUpdate
>    message signaling the change of key and the new key becoming known
>    to the kernel -- those will be queued encrypted, and decrypted in
>    SW as they are read by userspace (once the key is provided, ie same
>    as this patchset)
>  - Call ->tls_dev_del + ->tls_dev_add immediately during
>    setsockopt(TLS_RX)
> 
> TX
>  - After setsockopt(TLS_TX), switch to the existing SW path (not the
>    current device_fallback) until we're able to re-enable HW offload
>    - tls_device_{sendmsg,sendpage} will call into
>      tls_sw_{sendmsg,sendpage} under lock_sock to avoid changing
>      socket ops during the rekey while another thread might be waiting
>      on the lock
>  - We only re-enable HW offload (call ->tls_dev_add to install the new
>    key in HW) once all records sent with the old key have been
>    ACKed. At this point, all unacked records are SW-encrypted with the
>    new key, and the old key is unused by both HW and retransmissions.
>    - If there are no unacked records when userspace does
>      setsockopt(TLS_TX), we can (try to) install the new key in HW
>      immediately.
>    - If yet another key has been provided via setsockopt(TLS_TX), we
>      don't install intermediate keys, only the latest.
>    - TCP notifies ktls of ACKs via the icsk_clean_acked callback. In
>      case of a rekey, tls_icsk_clean_acked will record when all data
>      sent with the most recent past key has been sent. The next call
>      to sendmsg/sendpage will install the new key in HW.
>    - We close and push the current SW record before reenabling
>      offload.
> 
> If ->tls_dev_add fails to install the new key in HW, we stay in SW
> mode. We can add a counter to keep track of this.

SG!

> In addition:
> 
> Because we can't change socket ops during a rekey, we'll also have to
> modify do_tls_setsockopt_conf to check ctx->tx_conf and only call
> either tls_set_device_offload or tls_set_sw_offload. RX already uses
> the same ops for both TLS_HW and TLS_SW, so we could switch between HW
> and SW mode on rekey.
> 
> An alternative would be to have a common sendmsg/sendpage which locks
> the socket and then calls the correct implementation. We'll need that
> anyway for the offload under rekey case, so that would only add a test
> to the SW path's ops (compared to the current code). That should allow
> us to make build_protos a lot simpler.

No preference assuming perf is the same.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2023-03-22 19:43 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-14 11:17 [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3 Sabrina Dubroca
2023-02-14 11:17 ` [PATCH net-next v2 1/5] tls: remove tls_context argument from tls_set_sw_offload Sabrina Dubroca
2023-02-14 11:17 ` [PATCH net-next v2 2/5] tls: block decryption when a rekey is pending Sabrina Dubroca
2023-02-15  5:09   ` Jakub Kicinski
2023-02-15 17:37     ` Sabrina Dubroca
2023-02-14 11:17 ` [PATCH net-next v2 3/5] tls: implement rekey for TLS1.3 Sabrina Dubroca
2023-02-14 11:17 ` [PATCH net-next v2 4/5] selftests: tls: add key_generation argument to tls_crypto_info_init Sabrina Dubroca
2023-02-14 11:17 ` [PATCH net-next v2 5/5] selftests: tls: add rekey tests Sabrina Dubroca
2023-02-15  5:08 ` [PATCH net-next v2 0/5] tls: implement key updates for TLS1.3 Jakub Kicinski
2023-02-15 17:29   ` Sabrina Dubroca
2023-02-15 19:10     ` Jakub Kicinski
2023-02-15 23:23       ` Sabrina Dubroca
2023-02-16  3:57         ` Jakub Kicinski
2023-02-16 16:23           ` Sabrina Dubroca
2023-02-22  3:19             ` Jakub Kicinski
2023-02-23 16:27               ` Sabrina Dubroca
2023-02-23 17:29                 ` Jakub Kicinski
2023-03-13 15:41                   ` Sabrina Dubroca
2023-03-13 18:35                     ` Jakub Kicinski
2023-03-22 16:10                       ` Sabrina Dubroca
2023-03-22 19:43                         ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).