* [PATCH bpf v4 00/14] sockmap/tls fixes
From: Jakub Kicinski @ 2019-07-19 17:29 UTC
  To: john.fastabend, alexei.starovoitov, daniel
  Cc: edumazet, netdev, bpf, Jakub Kicinski

John says:

Resolve a series of splats discovered by syzbot and an unhash
TLS issue noted by Eric Dumazet.

The main issues revolved around the interaction between TLS and
sockmap tear down. TLS and sockmap could both reset sk->sk_prot
ops, creating a condition where a close or unhash op could end
up calling itself in a loop. A rare race caused by a missing
rcu sync operation was triggering a use after free. Then on the
TLS side, dropping the sock lock and re-acquiring it during the
close op could hang. Finally, sockmap must be deployed before
tls for current stack assumptions to be met; this is now
enforced. A future feature series could lift that restriction.

To fix this, first refactor the TLS code so the lock is held for
the entire teardown operation. Then add an unhash callback to
ensure TLS can not transition from the ESTABLISHED to the LISTEN
state; this transition is a bug similar to the one found and
fixed previously in sockmap. Then apply three fixes to sockmap
to fix up races on tear down around map free and close. Finally,
if sockmap is destroyed before TLS, we add a new ULP op, update(),
to inform the TLS stack that it should not call sockmap ops. This
last one appears to be the most commonly reported issue from syzbot.
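
To make the circular-call hazard concrete, here is a rough sketch
(illustrative only; sockmap_prot and tls_prot are stand-in names
and the sequence is simplified from the real code) of how two
layers independently saving and restoring sk->sk_prot can leave
ops pointing back into a layer that is being torn down:

  psock->sk_proto = sk->sk_prot;  /* sockmap saves the TCP ops */
  sk->sk_prot = &sockmap_prot;    /* sockmap ops installed     */

  ctx->sk_proto = sk->sk_prot;    /* TLS saves the sockmap ops */
  sk->sk_prot = &tls_prot;        /* TLS ops installed on top  */

  /* If sockmap tears down first and TLS is not told about it,
   * ctx->sk_proto still points at sockmap_prot; calling it from
   * the TLS close path re-enters sockmap, which can loop right
   * back into TLS.
   */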

v4:
 - fix some use after frees;
 - disable disconnect work for offload (ctx lifetime is much
   more complex);
 - remove some of the dead code which made it hard to understand
   (for me) that things work correctly (e.g. the checks TLS is
   the top ULP);
 - add selftests.

Jakub Kicinski (7):
  net/tls: don't arm strparser immediately in tls_set_sw_offload()
  net/tls: don't call tls_sk_proto_close for hw record offload
  selftests/tls: add a test for ULP but no keys
  selftests/tls: test error codes around TLS ULP installation
  selftests/tls: add a bidirectional test
  selftests/tls: close the socket with open record
  selftests/tls: add shutdown tests

John Fastabend (7):
  net/tls: remove close callback sock unlock/lock around TX work flush
  net/tls: remove sock unlock/lock around strp_done()
  net/tls: fix transition through disconnect with close
  bpf: sockmap, sock_map_delete needs to use xchg
  bpf: sockmap, synchronize_rcu before free'ing map
  bpf: sockmap, only create entry if ulp is not already enabled
  bpf: sockmap/tls, close can race with map free

 Documentation/networking/tls-offload.rst |   6 +
 include/linux/skmsg.h                    |   8 +-
 include/net/tcp.h                        |   3 +
 include/net/tls.h                        |  15 +-
 net/core/skmsg.c                         |   4 +-
 net/core/sock_map.c                      |  19 ++-
 net/ipv4/tcp_ulp.c                       |  13 ++
 net/tls/tls_main.c                       | 142 +++++++++++++----
 net/tls/tls_sw.c                         |  83 +++++++---
 tools/testing/selftests/net/tls.c        | 194 +++++++++++++++++++++++
 10 files changed, 419 insertions(+), 68 deletions(-)

-- 
2.21.0



* [PATCH bpf v4 01/14] net/tls: don't arm strparser immediately in tls_set_sw_offload()
From: Jakub Kicinski @ 2019-07-19 17:29 UTC
  To: john.fastabend, alexei.starovoitov, daniel
  Cc: edumazet, netdev, bpf, Jakub Kicinski, Dirk van der Merwe

In tls_set_device_offload_rx() we prepare the software context
for RX fallback and proceed to add the connection to the device.
Unfortunately, software context prep includes arming strparser
so in case of a later error we have to release the socket lock
to call strp_done().

In preparation for not releasing the socket lock halfway through
callbacks, move arming the strparser into a separate function.
Following patches will make use of that.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 include/net/tls.h    |  1 +
 net/tls/tls_device.c |  1 +
 net/tls/tls_main.c   |  8 +++++---
 net/tls/tls_sw.c     | 19 ++++++++++++-------
 4 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/include/net/tls.h b/include/net/tls.h
index 584609174fe0..43f551cd508b 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -355,6 +355,7 @@ int tls_sk_attach(struct sock *sk, int optname, char __user *optval,
 		  unsigned int optlen);
 
 int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx);
+void tls_sw_strparser_arm(struct sock *sk, struct tls_context *ctx);
 int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
 int tls_sw_sendpage(struct sock *sk, struct page *page,
 		    int offset, size_t size, int flags);
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 7c0b2b778703..4d67d72f007c 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -1045,6 +1045,7 @@ int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx)
 	rc = tls_set_sw_offload(sk, ctx, 0);
 	if (rc)
 		goto release_ctx;
+	tls_sw_strparser_arm(sk, ctx);
 
 	rc = netdev->tlsdev_ops->tls_dev_add(netdev, sk, TLS_OFFLOAD_CTX_DIR_RX,
 					     &ctx->crypto_recv.info,
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 4674e57e66b0..85a9d7d57b32 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -526,6 +526,8 @@ static int do_tls_setsockopt_conf(struct sock *sk, char __user *optval,
 		{
 #endif
 			rc = tls_set_sw_offload(sk, ctx, 1);
+			if (rc)
+				goto err_crypto_info;
 			conf = TLS_SW;
 		}
 	} else {
@@ -537,13 +539,13 @@ static int do_tls_setsockopt_conf(struct sock *sk, char __user *optval,
 		{
 #endif
 			rc = tls_set_sw_offload(sk, ctx, 0);
+			if (rc)
+				goto err_crypto_info;
+			tls_sw_strparser_arm(sk, ctx);
 			conf = TLS_SW;
 		}
 	}
 
-	if (rc)
-		goto err_crypto_info;
-
 	if (tx)
 		ctx->tx_conf = conf;
 	else
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 53b4ad94e74a..f58a8ffc2a9c 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2160,6 +2160,18 @@ void tls_sw_write_space(struct sock *sk, struct tls_context *ctx)
 	}
 }
 
+void tls_sw_strparser_arm(struct sock *sk, struct tls_context *tls_ctx)
+{
+	struct tls_sw_context_rx *rx_ctx = tls_sw_ctx_rx(tls_ctx);
+
+	write_lock_bh(&sk->sk_callback_lock);
+	rx_ctx->saved_data_ready = sk->sk_data_ready;
+	sk->sk_data_ready = tls_data_ready;
+	write_unlock_bh(&sk->sk_callback_lock);
+
+	strp_check_rcv(&rx_ctx->strp);
+}
+
 int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx)
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
@@ -2357,13 +2369,6 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx)
 		cb.parse_msg = tls_read_size;
 
 		strp_init(&sw_ctx_rx->strp, sk, &cb);
-
-		write_lock_bh(&sk->sk_callback_lock);
-		sw_ctx_rx->saved_data_ready = sk->sk_data_ready;
-		sk->sk_data_ready = tls_data_ready;
-		write_unlock_bh(&sk->sk_callback_lock);
-
-		strp_check_rcv(&sw_ctx_rx->strp);
 	}
 
 	goto out;
-- 
2.21.0



* [PATCH bpf v4 02/14] net/tls: don't call tls_sk_proto_close for hw record offload
From: Jakub Kicinski @ 2019-07-19 17:29 UTC
  To: john.fastabend, alexei.starovoitov, daniel
  Cc: edumazet, netdev, bpf, Jakub Kicinski, Dirk van der Merwe

The deprecated TOE offload doesn't actually do anything in
tls_sk_proto_close() - all TLS code is skipped and the context
is not freed. Remove the callback to make it easier to refactor
tls_sk_proto_close().

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 net/tls/tls_main.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 85a9d7d57b32..7ab682ed99fa 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -271,9 +271,6 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
 	lock_sock(sk);
 	sk_proto_close = ctx->sk_proto_close;
 
-	if (ctx->tx_conf == TLS_HW_RECORD && ctx->rx_conf == TLS_HW_RECORD)
-		goto skip_tx_cleanup;
-
 	if (ctx->tx_conf == TLS_BASE && ctx->rx_conf == TLS_BASE) {
 		free_ctx = true;
 		goto skip_tx_cleanup;
@@ -766,7 +763,6 @@ static void build_protos(struct proto prot[TLS_NUM_CONFIG][TLS_NUM_CONFIG],
 	prot[TLS_HW_RECORD][TLS_HW_RECORD] = *base;
 	prot[TLS_HW_RECORD][TLS_HW_RECORD].hash		= tls_hw_hash;
 	prot[TLS_HW_RECORD][TLS_HW_RECORD].unhash	= tls_hw_unhash;
-	prot[TLS_HW_RECORD][TLS_HW_RECORD].close	= tls_sk_proto_close;
 }
 
 static int tls_init(struct sock *sk)
-- 
2.21.0



* [PATCH bpf v4 03/14] net/tls: remove close callback sock unlock/lock around TX work flush
From: Jakub Kicinski @ 2019-07-19 17:29 UTC
  To: john.fastabend, alexei.starovoitov, daniel
  Cc: edumazet, netdev, bpf, Jakub Kicinski, Dirk van der Merwe

From: John Fastabend <john.fastabend@gmail.com>

The tls close() callback currently drops the sock lock, makes a
cancel_delayed_work_sync() call, and then relocks the sock.

By restructuring the code we can avoid dropping the lock and then
reclaiming it. To simplify this we do the following:

 tls_sk_proto_close
 set_bit(CLOSING)
 set_bit(SCHEDULE)
 cancel_delayed_work_sync() <- cancel workqueue
 lock_sock(sk)
 ...
 release_sock(sk)
 strp_done()

Setting the CLOSING bit prevents the SCHEDULE bit from being
cleared by any workqueue items, e.g. if one happens to be
scheduled and run between when we set the SCHEDULE bit and cancel
the work. Then, because the SCHEDULE bit remains set, no new work
will be scheduled.

Tested with net selftests and bpf selftests.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 include/net/tls.h  |  2 ++
 net/tls/tls_main.c |  3 +++
 net/tls/tls_sw.c   | 24 +++++++++++++++++-------
 3 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/include/net/tls.h b/include/net/tls.h
index 43f551cd508b..d4276cb6de53 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -162,6 +162,7 @@ struct tls_sw_context_tx {
 	int async_capable;
 
 #define BIT_TX_SCHEDULED	0
+#define BIT_TX_CLOSING		1
 	unsigned long tx_bitmask;
 };
 
@@ -360,6 +361,7 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
 int tls_sw_sendpage(struct sock *sk, struct page *page,
 		    int offset, size_t size, int flags);
 void tls_sw_close(struct sock *sk, long timeout);
+void tls_sw_cancel_work_tx(struct tls_context *tls_ctx);
 void tls_sw_free_resources_tx(struct sock *sk);
 void tls_sw_free_resources_rx(struct sock *sk);
 void tls_sw_release_resources_rx(struct sock *sk);
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 7ab682ed99fa..5c29b410cf7d 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -268,6 +268,9 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
 	void (*sk_proto_close)(struct sock *sk, long timeout);
 	bool free_ctx = false;
 
+	if (ctx->tx_conf == TLS_SW)
+		tls_sw_cancel_work_tx(ctx);
+
 	lock_sock(sk);
 	sk_proto_close = ctx->sk_proto_close;
 
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index f58a8ffc2a9c..38c0e53c727d 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2054,6 +2054,15 @@ static void tls_data_ready(struct sock *sk)
 	}
 }
 
+void tls_sw_cancel_work_tx(struct tls_context *tls_ctx)
+{
+	struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
+
+	set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask);
+	set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
+	cancel_delayed_work_sync(&ctx->tx_work.work);
+}
+
 void tls_sw_free_resources_tx(struct sock *sk)
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
@@ -2065,11 +2074,6 @@ void tls_sw_free_resources_tx(struct sock *sk)
 	if (atomic_read(&ctx->encrypt_pending))
 		crypto_wait_req(-EINPROGRESS, &ctx->async_wait);
 
-	release_sock(sk);
-	cancel_delayed_work_sync(&ctx->tx_work.work);
-	lock_sock(sk);
-
-	/* Tx whatever records we can transmit and abandon the rest */
 	tls_tx_records(sk, -1);
 
 	/* Free up un-sent records in tx_list. First, free
@@ -2137,11 +2141,17 @@ static void tx_work_handler(struct work_struct *work)
 					       struct tx_work, work);
 	struct sock *sk = tx_work->sk;
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
-	struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
+	struct tls_sw_context_tx *ctx;
 
-	if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
+	if (unlikely(!tls_ctx))
 		return;
 
+	ctx = tls_sw_ctx_tx(tls_ctx);
+	if (test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask))
+		return;
+
+	if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
+		return;
 	lock_sock(sk);
 	tls_tx_records(sk, -1);
 	release_sock(sk);
-- 
2.21.0



* [PATCH bpf v4 04/14] net/tls: remove sock unlock/lock around strp_done()
From: Jakub Kicinski @ 2019-07-19 17:29 UTC
  To: john.fastabend, alexei.starovoitov, daniel
  Cc: edumazet, netdev, bpf, Jakub Kicinski, Dirk van der Merwe

From: John Fastabend <john.fastabend@gmail.com>

The tls close() callback currently drops the sock lock to call
strp_done(). Split up the RX cleanup into three stages: stopping
the strparser and releasing most resources, syncing the strparser,
and finally freeing the context.

To avoid the need for a strp_done() call on the cleanup path of
device offload, make sure we don't arm the strparser until we are
sure init will be successful.
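
A rough sketch of the resulting close-path ordering (condensed
from the diff below; comments are annotations, not code):

  tls_sk_proto_close(sk, timeout)
    lock_sock(sk)
    tls_sk_proto_cleanup()      /* strp_stop(), restore sk_data_ready */
    release_sock(sk)
    tls_sw_free_ctx_tx(ctx)     /* TX context can be freed right away */
    tls_sw_strparser_done(ctx)  /* strp_done() without the sock lock  */
    tls_sw_free_ctx_rx(ctx)     /* only now free the RX context       */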

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 include/net/tls.h    |  7 ++---
 net/tls/tls_device.c |  1 -
 net/tls/tls_main.c   | 61 ++++++++++++++++++++++----------------------
 net/tls/tls_sw.c     | 40 +++++++++++++++++++++--------
 4 files changed, 64 insertions(+), 45 deletions(-)

diff --git a/include/net/tls.h b/include/net/tls.h
index d4276cb6de53..235508e35fd4 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -107,9 +107,7 @@ struct tls_device {
 enum {
 	TLS_BASE,
 	TLS_SW,
-#ifdef CONFIG_TLS_DEVICE
 	TLS_HW,
-#endif
 	TLS_HW_RECORD,
 	TLS_NUM_CONFIG,
 };
@@ -357,14 +355,17 @@ int tls_sk_attach(struct sock *sk, int optname, char __user *optval,
 
 int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx);
 void tls_sw_strparser_arm(struct sock *sk, struct tls_context *ctx);
+void tls_sw_strparser_done(struct tls_context *tls_ctx);
 int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
 int tls_sw_sendpage(struct sock *sk, struct page *page,
 		    int offset, size_t size, int flags);
 void tls_sw_close(struct sock *sk, long timeout);
 void tls_sw_cancel_work_tx(struct tls_context *tls_ctx);
-void tls_sw_free_resources_tx(struct sock *sk);
+void tls_sw_release_resources_tx(struct sock *sk);
+void tls_sw_free_ctx_tx(struct tls_context *tls_ctx);
 void tls_sw_free_resources_rx(struct sock *sk);
 void tls_sw_release_resources_rx(struct sock *sk);
+void tls_sw_free_ctx_rx(struct tls_context *tls_ctx);
 int tls_sw_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 		   int nonblock, int flags, int *addr_len);
 bool tls_sw_stream_read(const struct sock *sk);
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 4d67d72f007c..7c0b2b778703 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -1045,7 +1045,6 @@ int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx)
 	rc = tls_set_sw_offload(sk, ctx, 0);
 	if (rc)
 		goto release_ctx;
-	tls_sw_strparser_arm(sk, ctx);
 
 	rc = netdev->tlsdev_ops->tls_dev_add(netdev, sk, TLS_OFFLOAD_CTX_DIR_RX,
 					     &ctx->crypto_recv.info,
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 5c29b410cf7d..d152a00a7a27 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -261,24 +261,9 @@ void tls_ctx_free(struct tls_context *ctx)
 	kfree(ctx);
 }
 
-static void tls_sk_proto_close(struct sock *sk, long timeout)
+static void tls_sk_proto_cleanup(struct sock *sk,
+				 struct tls_context *ctx, long timeo)
 {
-	struct tls_context *ctx = tls_get_ctx(sk);
-	long timeo = sock_sndtimeo(sk, 0);
-	void (*sk_proto_close)(struct sock *sk, long timeout);
-	bool free_ctx = false;
-
-	if (ctx->tx_conf == TLS_SW)
-		tls_sw_cancel_work_tx(ctx);
-
-	lock_sock(sk);
-	sk_proto_close = ctx->sk_proto_close;
-
-	if (ctx->tx_conf == TLS_BASE && ctx->rx_conf == TLS_BASE) {
-		free_ctx = true;
-		goto skip_tx_cleanup;
-	}
-
 	if (unlikely(sk->sk_write_pending) &&
 	    !wait_on_pending_writer(sk, &timeo))
 		tls_handle_open_record(sk, 0);
@@ -287,7 +272,7 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
 	if (ctx->tx_conf == TLS_SW) {
 		kfree(ctx->tx.rec_seq);
 		kfree(ctx->tx.iv);
-		tls_sw_free_resources_tx(sk);
+		tls_sw_release_resources_tx(sk);
 #ifdef CONFIG_TLS_DEVICE
 	} else if (ctx->tx_conf == TLS_HW) {
 		tls_device_free_resources_tx(sk);
@@ -295,26 +280,40 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
 	}
 
 	if (ctx->rx_conf == TLS_SW)
-		tls_sw_free_resources_rx(sk);
+		tls_sw_release_resources_rx(sk);
 
 #ifdef CONFIG_TLS_DEVICE
 	if (ctx->rx_conf == TLS_HW)
 		tls_device_offload_cleanup_rx(sk);
-
-	if (ctx->tx_conf != TLS_HW && ctx->rx_conf != TLS_HW) {
-#else
-	{
 #endif
-		tls_ctx_free(ctx);
-		ctx = NULL;
-	}
+}
+
+static void tls_sk_proto_close(struct sock *sk, long timeout)
+{
+	void (*sk_proto_close)(struct sock *sk, long timeout);
+	struct tls_context *ctx = tls_get_ctx(sk);
+	long timeo = sock_sndtimeo(sk, 0);
+	bool free_ctx;
+
+	if (ctx->tx_conf == TLS_SW)
+		tls_sw_cancel_work_tx(ctx);
+
+	lock_sock(sk);
+	free_ctx = ctx->tx_conf != TLS_HW && ctx->rx_conf != TLS_HW;
+	sk_proto_close = ctx->sk_proto_close;
+
+	if (ctx->tx_conf != TLS_BASE || ctx->rx_conf != TLS_BASE)
+		tls_sk_proto_cleanup(sk, ctx, timeo);
 
-skip_tx_cleanup:
 	release_sock(sk);
+	if (ctx->tx_conf == TLS_SW)
+		tls_sw_free_ctx_tx(ctx);
+	if (ctx->rx_conf == TLS_SW || ctx->rx_conf == TLS_HW)
+		tls_sw_strparser_done(ctx);
+	if (ctx->rx_conf == TLS_SW)
+		tls_sw_free_ctx_rx(ctx);
 	sk_proto_close(sk, timeout);
-	/* free ctx for TLS_HW_RECORD, used by tcp_set_state
-	 * for sk->sk_prot->unhash [tls_hw_unhash]
-	 */
+
 	if (free_ctx)
 		tls_ctx_free(ctx);
 }
@@ -541,9 +540,9 @@ static int do_tls_setsockopt_conf(struct sock *sk, char __user *optval,
 			rc = tls_set_sw_offload(sk, ctx, 0);
 			if (rc)
 				goto err_crypto_info;
-			tls_sw_strparser_arm(sk, ctx);
 			conf = TLS_SW;
 		}
+		tls_sw_strparser_arm(sk, ctx);
 	}
 
 	if (tx)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 38c0e53c727d..91d21b048a9b 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2063,7 +2063,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx)
 	cancel_delayed_work_sync(&ctx->tx_work.work);
 }
 
-void tls_sw_free_resources_tx(struct sock *sk)
+void tls_sw_release_resources_tx(struct sock *sk)
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
@@ -2096,6 +2096,11 @@ void tls_sw_free_resources_tx(struct sock *sk)
 
 	crypto_free_aead(ctx->aead_send);
 	tls_free_open_rec(sk);
+}
+
+void tls_sw_free_ctx_tx(struct tls_context *tls_ctx)
+{
+	struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
 
 	kfree(ctx);
 }
@@ -2114,25 +2119,40 @@ void tls_sw_release_resources_rx(struct sock *sk)
 		skb_queue_purge(&ctx->rx_list);
 		crypto_free_aead(ctx->aead_recv);
 		strp_stop(&ctx->strp);
-		write_lock_bh(&sk->sk_callback_lock);
-		sk->sk_data_ready = ctx->saved_data_ready;
-		write_unlock_bh(&sk->sk_callback_lock);
-		release_sock(sk);
-		strp_done(&ctx->strp);
-		lock_sock(sk);
+		/* If tls_sw_strparser_arm() was not called (cleanup paths)
+		 * we still want to strp_stop(), but sk->sk_data_ready was
+		 * never swapped.
+		 */
+		if (ctx->saved_data_ready) {
+			write_lock_bh(&sk->sk_callback_lock);
+			sk->sk_data_ready = ctx->saved_data_ready;
+			write_unlock_bh(&sk->sk_callback_lock);
+		}
 	}
 }
 
-void tls_sw_free_resources_rx(struct sock *sk)
+void tls_sw_strparser_done(struct tls_context *tls_ctx)
 {
-	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
 
-	tls_sw_release_resources_rx(sk);
+	strp_done(&ctx->strp);
+}
+
+void tls_sw_free_ctx_rx(struct tls_context *tls_ctx)
+{
+	struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
 
 	kfree(ctx);
 }
 
+void tls_sw_free_resources_rx(struct sock *sk)
+{
+	struct tls_context *tls_ctx = tls_get_ctx(sk);
+
+	tls_sw_release_resources_rx(sk);
+	tls_sw_free_ctx_rx(tls_ctx);
+}
+
 /* The work handler to transmitt the encrypted records in tx_list */
 static void tx_work_handler(struct work_struct *work)
 {
-- 
2.21.0



* [PATCH bpf v4 05/14] net/tls: fix transition through disconnect with close
From: Jakub Kicinski @ 2019-07-19 17:29 UTC
  To: john.fastabend, alexei.starovoitov, daniel
  Cc: edumazet, netdev, bpf, Jakub Kicinski

From: John Fastabend <john.fastabend@gmail.com>

It is possible (via shutdown()) for TCP socks to go through the
TCP_CLOSE state via tcp_disconnect() without ever calling
tcp_close(), which would otherwise invoke the tls close callback.
Because of this a user could disconnect a socket and then put it
in the LISTEN state, which would break our assumption that sockets
are always in the ESTABLISHED state.

More directly: close() can call unhash(), and unhash is
implemented by sockmap, so if a sockmap socket has TLS enabled we
can incorrectly destroy the psock from unhash() and then call its
close handler again. But because the psock (the sockmap socket
representation) is already destroyed, we call the close handler
in sk->sk_prot. However, in some cases (the TLS BASE/BASE case)
this still points at the sockmap close handler, resulting in a
circular call and the crash reported by syzbot.

To fix both above issues implement the unhash() routine for TLS.
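
A minimal userspace sketch of the problematic transition
(illustrative only; error handling and TLS key setup omitted):

  /* fd is a connected TCP socket with the TLS ULP installed */
  setsockopt(fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
  /* ... */
  shutdown(fd, SHUT_RDWR); /* can reach TCP_CLOSE via tcp_disconnect(),
                            * bypassing tcp_close() and with it the
                            * tls close callback */
  listen(fd, 1);           /* ESTABLISHED -> LISTEN with stale TLS state */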

v4:
 - add note about tls offload still needing the fix;
 - move sk_proto to the cold cache line;
 - split TX context free into "release" and "free",
   otherwise the GC work itself is in already freed
   memory;
 - more TX before RX for consistency;
 - reuse tls_ctx_free();
 - schedule the GC work after we're done with context
   to avoid UAF;
 - don't set the unhash in all modes, all modes "inherit"
   TLS_BASE's callbacks anyway;
 - disable the unhash hook for TLS_HW.

Fixes: 3c4d7559159bf ("tls: kernel TLS support")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 Documentation/networking/tls-offload.rst |  6 +++
 include/net/tls.h                        |  5 ++-
 net/tls/tls_main.c                       | 55 ++++++++++++++++++++++++
 3 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/tls-offload.rst b/Documentation/networking/tls-offload.rst
index 048e5ca44824..8a1eeb393316 100644
--- a/Documentation/networking/tls-offload.rst
+++ b/Documentation/networking/tls-offload.rst
@@ -513,3 +513,9 @@ Redirects leak clear text
 
 In the RX direction, if segment has already been decrypted by the device
 and it gets redirected or mirrored - clear text will be transmitted out.
+
+shutdown() doesn't clear TLS state
+----------------------------------
+
+shutdown() system call allows for a TLS socket to be reused as a different
+connection. Offload doesn't currently handle that.
diff --git a/include/net/tls.h b/include/net/tls.h
index 235508e35fd4..9e425ac2de45 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -271,6 +271,8 @@ struct tls_context {
 	unsigned long flags;
 
 	/* cache cold stuff */
+	struct proto *sk_proto;
+
 	void (*sk_destruct)(struct sock *sk);
 	void (*sk_proto_close)(struct sock *sk, long timeout);
 
@@ -288,6 +290,8 @@ struct tls_context {
 
 	struct list_head list;
 	refcount_t refcount;
+
+	struct work_struct gc;
 };
 
 enum tls_offload_ctx_dir {
@@ -359,7 +363,6 @@ void tls_sw_strparser_done(struct tls_context *tls_ctx);
 int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
 int tls_sw_sendpage(struct sock *sk, struct page *page,
 		    int offset, size_t size, int flags);
-void tls_sw_close(struct sock *sk, long timeout);
 void tls_sw_cancel_work_tx(struct tls_context *tls_ctx);
 void tls_sw_release_resources_tx(struct sock *sk);
 void tls_sw_free_ctx_tx(struct tls_context *tls_ctx);
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index d152a00a7a27..48f1c26459d0 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -261,6 +261,33 @@ void tls_ctx_free(struct tls_context *ctx)
 	kfree(ctx);
 }
 
+static void tls_ctx_free_deferred(struct work_struct *gc)
+{
+	struct tls_context *ctx = container_of(gc, struct tls_context, gc);
+
+	/* Ensure any remaining work items are completed. The sk will
+	 * already have lost its tls_ctx reference by the time we get
+	 * here so no xmit operation will actually be performed.
+	 */
+	if (ctx->tx_conf == TLS_SW) {
+		tls_sw_cancel_work_tx(ctx);
+		tls_sw_free_ctx_tx(ctx);
+	}
+
+	if (ctx->rx_conf == TLS_SW) {
+		tls_sw_strparser_done(ctx);
+		tls_sw_free_ctx_rx(ctx);
+	}
+
+	tls_ctx_free(ctx);
+}
+
+static void tls_ctx_free_wq(struct tls_context *ctx)
+{
+	INIT_WORK(&ctx->gc, tls_ctx_free_deferred);
+	schedule_work(&ctx->gc);
+}
+
 static void tls_sk_proto_cleanup(struct sock *sk,
 				 struct tls_context *ctx, long timeo)
 {
@@ -288,6 +315,26 @@ static void tls_sk_proto_cleanup(struct sock *sk,
 #endif
 }
 
+static void tls_sk_proto_unhash(struct sock *sk)
+{
+	struct inet_connection_sock *icsk = inet_csk(sk);
+	long timeo = sock_sndtimeo(sk, 0);
+	struct tls_context *ctx;
+
+	if (unlikely(!icsk->icsk_ulp_data)) {
+		if (sk->sk_prot->unhash)
+			sk->sk_prot->unhash(sk);
+	}
+
+	ctx = tls_get_ctx(sk);
+	tls_sk_proto_cleanup(sk, ctx, timeo);
+	icsk->icsk_ulp_data = NULL;
+
+	if (ctx->sk_proto->unhash)
+		ctx->sk_proto->unhash(sk);
+	tls_ctx_free_wq(ctx);
+}
+
 static void tls_sk_proto_close(struct sock *sk, long timeout)
 {
 	void (*sk_proto_close)(struct sock *sk, long timeout);
@@ -305,6 +352,7 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
 	if (ctx->tx_conf != TLS_BASE || ctx->rx_conf != TLS_BASE)
 		tls_sk_proto_cleanup(sk, ctx, timeo);
 
+	sk->sk_prot = ctx->sk_proto;
 	release_sock(sk);
 	if (ctx->tx_conf == TLS_SW)
 		tls_sw_free_ctx_tx(ctx);
@@ -608,6 +656,7 @@ static struct tls_context *create_ctx(struct sock *sk)
 	ctx->setsockopt = sk->sk_prot->setsockopt;
 	ctx->getsockopt = sk->sk_prot->getsockopt;
 	ctx->sk_proto_close = sk->sk_prot->close;
+	ctx->unhash = sk->sk_prot->unhash;
 	return ctx;
 }
 
@@ -731,6 +780,7 @@ static void build_protos(struct proto prot[TLS_NUM_CONFIG][TLS_NUM_CONFIG],
 	prot[TLS_BASE][TLS_BASE].setsockopt	= tls_setsockopt;
 	prot[TLS_BASE][TLS_BASE].getsockopt	= tls_getsockopt;
 	prot[TLS_BASE][TLS_BASE].close		= tls_sk_proto_close;
+	prot[TLS_BASE][TLS_BASE].unhash		= tls_sk_proto_unhash;
 
 	prot[TLS_SW][TLS_BASE] = prot[TLS_BASE][TLS_BASE];
 	prot[TLS_SW][TLS_BASE].sendmsg		= tls_sw_sendmsg;
@@ -748,16 +798,20 @@ static void build_protos(struct proto prot[TLS_NUM_CONFIG][TLS_NUM_CONFIG],
 
 #ifdef CONFIG_TLS_DEVICE
 	prot[TLS_HW][TLS_BASE] = prot[TLS_BASE][TLS_BASE];
+	prot[TLS_HW][TLS_BASE].unhash		= base->unhash;
 	prot[TLS_HW][TLS_BASE].sendmsg		= tls_device_sendmsg;
 	prot[TLS_HW][TLS_BASE].sendpage		= tls_device_sendpage;
 
 	prot[TLS_HW][TLS_SW] = prot[TLS_BASE][TLS_SW];
+	prot[TLS_HW][TLS_SW].unhash		= base->unhash;
 	prot[TLS_HW][TLS_SW].sendmsg		= tls_device_sendmsg;
 	prot[TLS_HW][TLS_SW].sendpage		= tls_device_sendpage;
 
 	prot[TLS_BASE][TLS_HW] = prot[TLS_BASE][TLS_SW];
+	prot[TLS_BASE][TLS_HW].unhash		= base->unhash;
 
 	prot[TLS_SW][TLS_HW] = prot[TLS_SW][TLS_SW];
+	prot[TLS_SW][TLS_HW].unhash		= base->unhash;
 
 	prot[TLS_HW][TLS_HW] = prot[TLS_HW][TLS_SW];
 #endif
@@ -794,6 +848,7 @@ static int tls_init(struct sock *sk)
 	tls_build_proto(sk);
 	ctx->tx_conf = TLS_BASE;
 	ctx->rx_conf = TLS_BASE;
+	ctx->sk_proto = sk->sk_prot;
 	update_sk_prot(sk, ctx);
 out:
 	return rc;
-- 
2.21.0



* [PATCH bpf v4 06/14] bpf: sockmap, sock_map_delete needs to use xchg
From: Jakub Kicinski @ 2019-07-19 17:29 UTC
  To: john.fastabend, alexei.starovoitov, daniel; +Cc: edumazet, netdev, bpf

From: John Fastabend <john.fastabend@gmail.com>

__sock_map_delete() may be called from a tcp event such as unhash
or close, via the following trace:

  tcp_bpf_close()
    tcp_bpf_remove()
      sk_psock_unlink()
        sock_map_delete_from_link()
          __sock_map_delete()

In this case the sock lock is held but this only protects against
duplicate removals on the TCP side. If the map is free'd then we have
this trace,

  sock_map_free
    xchg()                  <- replaces map entry
    sock_map_unref()
      sk_psock_put()
        sock_map_del_link()

The __sock_map_delete() call, however, uses a read-test-NULL
sequence on the map entry, which can result in both paths trying
to free the same map entry.

To fix this, use xchg() in the TCP paths as well, so we avoid
having two live references to the same map entry.
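
A minimal sketch of the difference (not the exact kernel code):

  /* racy: both the TCP path and the map free path can read the
   * same non-NULL sk and each go on to drop a reference
   */
  sk = *psk;
  if (!sk_test || sk_test == sk)
          *psk = NULL;

  /* safe: xchg() guarantees only one caller sees the non-NULL sk */
  sk = xchg(psk, NULL);
  if (sk)
          sock_map_unref(sk, psk);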

Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 net/core/sock_map.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 52d4faeee18b..28702f2e9a4a 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -276,16 +276,20 @@ static int __sock_map_delete(struct bpf_stab *stab, struct sock *sk_test,
 			     struct sock **psk)
 {
 	struct sock *sk;
+	int err = 0;
 
 	raw_spin_lock_bh(&stab->lock);
 	sk = *psk;
 	if (!sk_test || sk_test == sk)
-		*psk = NULL;
+		sk = xchg(psk, NULL);
+
+	if (likely(sk))
+		sock_map_unref(sk, psk);
+	else
+		err = -EINVAL;
+
 	raw_spin_unlock_bh(&stab->lock);
-	if (unlikely(!sk))
-		return -EINVAL;
-	sock_map_unref(sk, psk);
-	return 0;
+	return err;
 }
 
 static void sock_map_delete_from_link(struct bpf_map *map, struct sock *sk,
-- 
2.21.0



* [PATCH bpf v4 07/14] bpf: sockmap, synchronize_rcu before free'ing map
From: Jakub Kicinski @ 2019-07-19 17:29 UTC
  To: john.fastabend, alexei.starovoitov, daniel; +Cc: edumazet, netdev, bpf

From: John Fastabend <john.fastabend@gmail.com>

We need a synchronize_rcu() before freeing the sockmap because
any outstanding psock references will hold a pointer to the map,
and using that pointer after the free would trigger a use after
free.
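
This is the standard RCU deferred-free pattern; roughly (sketch,
where remove_all_entries() is a hypothetical stand-in for the
existing entry-removal loop):

  remove_all_entries(stab);       /* unpublish all map entries */
  synchronize_rcu();              /* wait out readers that may still
                                   * hold a pointer to the map */
  bpf_map_area_free(stab->sks);   /* now no psock can observe the map */
  kfree(stab);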

Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 net/core/sock_map.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 28702f2e9a4a..56bcabe7c2f2 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -247,6 +247,8 @@ static void sock_map_free(struct bpf_map *map)
 	raw_spin_unlock_bh(&stab->lock);
 	rcu_read_unlock();
 
+	synchronize_rcu();
+
 	bpf_map_area_free(stab->sks);
 	kfree(stab);
 }
-- 
2.21.0



* [PATCH bpf v4 08/14] bpf: sockmap, only create entry if ulp is not already enabled
From: Jakub Kicinski @ 2019-07-19 17:29 UTC
  To: john.fastabend, alexei.starovoitov, daniel; +Cc: edumazet, netdev, bpf

From: John Fastabend <john.fastabend@gmail.com>

Sockmap does not currently support adding sockets after TLS has
been enabled. There never was a real use case for this so it was
never added. But we lost the check for an existing ULP at some
point, so add it back here and fail the socket insert if TLS is
already enabled. Future work could make sockmap support this use
case; for now, fix the bug. The supported ordering is sketched
below.
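
A userspace sketch of the ordering (map setup elided; libbpf-style
calls, sock_fd and idx are placeholders):

  /* supported: insert into sockmap first, then install the TLS ULP */
  bpf_map_update_elem(map_fd, &idx, &sock_fd, BPF_ANY);
  setsockopt(sock_fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));

  /* rejected after this patch: TLS first, then the sockmap insert */
  setsockopt(sock_fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
  bpf_map_update_elem(map_fd, &idx, &sock_fd, BPF_ANY); /* -EINVAL */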

Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 net/core/sock_map.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 56bcabe7c2f2..1330a7442e5b 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -334,6 +334,7 @@ static int sock_map_update_common(struct bpf_map *map, u32 idx,
 				  struct sock *sk, u64 flags)
 {
 	struct bpf_stab *stab = container_of(map, struct bpf_stab, map);
+	struct inet_connection_sock *icsk = inet_csk(sk);
 	struct sk_psock_link *link;
 	struct sk_psock *psock;
 	struct sock *osk;
@@ -344,6 +345,8 @@ static int sock_map_update_common(struct bpf_map *map, u32 idx,
 		return -EINVAL;
 	if (unlikely(idx >= map->max_entries))
 		return -E2BIG;
+	if (unlikely(icsk->icsk_ulp_data))
+		return -EINVAL;
 
 	link = sk_psock_init_link();
 	if (!link)
-- 
2.21.0



* [PATCH bpf v4 09/14] bpf: sockmap/tls, close can race with map free
From: Jakub Kicinski @ 2019-07-19 17:29 UTC
  To: john.fastabend, alexei.starovoitov, daniel
  Cc: edumazet, netdev, bpf, syzbot+06537213db7ba2745c4a,
	Jakub Kicinski, Dirk van der Merwe

From: John Fastabend <john.fastabend@gmail.com>

When a map free is called and in parallel a socket is closed we
have two paths that can potentially reset the socket prot ops: the
bpf close() path and the map free path. This creates a problem
with which prot ops should be used from the socket close side.

If the map free side completes first then we want to call the
original lowest level ops. However, if the tls path runs first
we want to call the sockmap ops. Additionally, there was no
locking around prot updates in the TLS code paths, so the prot
ops could be changed multiple times, once from the TLS path and
again from the sockmap side, potentially leaving the ops pointed
at either TLS or sockmap when the psock and/or tls context have
already been destroyed.

To fix this race, first only update the ops inside the callback
lock so that TLS, sockmap and the lowest level all agree on the
prot state. Second, add a ULP callback, update(), so that lower
layers can inform the upper layer when they are being removed,
allowing the upper layer to reset its prot ops.

This gets us close to allowing sockmap and tls to be stacked
in arbitrary order, but we will save that patch for *next trees.
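
The resulting restore path, roughly (condensed from the diffs
below; comments are annotations, not code):

  sk_psock_restore_proto(sk, psock)
    if (icsk->icsk_ulp_data)              /* a ULP (TLS) stacked on top */
      tcp_update_ulp(sk, psock->sk_proto) /* -> tls_update(): TLS swaps
                                           * its saved base ops instead */
    else
      sk->sk_prot = psock->sk_proto       /* no ULP: restore directly */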

v4:
 - make sure we don't free things for device;
 - remove the checks which swap the callbacks back
   only if TLS is at the top.

Reported-by: syzbot+06537213db7ba2745c4a@syzkaller.appspotmail.com
Fixes: 02c558b2d5d6 ("bpf: sockmap, support for msg_peek in sk_msg with redirect ingress")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
Let's add the check that TLS's callbacks are on top when stacking in
any order is actually supported. Or perhaps let's just not support
it until we fix the remaining bugs and races?
---
 include/linux/skmsg.h |  8 +++++++-
 include/net/tcp.h     |  3 +++
 net/core/skmsg.c      |  4 ++--
 net/ipv4/tcp_ulp.c    | 13 +++++++++++++
 net/tls/tls_main.c    | 33 ++++++++++++++++++++++++++++-----
 5 files changed, 53 insertions(+), 8 deletions(-)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index 50ced8aba9db..e4b3fb4bb77c 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -354,7 +354,13 @@ static inline void sk_psock_restore_proto(struct sock *sk,
 	sk->sk_write_space = psock->saved_write_space;
 
 	if (psock->sk_proto) {
-		sk->sk_prot = psock->sk_proto;
+		struct inet_connection_sock *icsk = inet_csk(sk);
+		bool has_ulp = !!icsk->icsk_ulp_data;
+
+		if (has_ulp)
+			tcp_update_ulp(sk, psock->sk_proto);
+		else
+			sk->sk_prot = psock->sk_proto;
 		psock->sk_proto = NULL;
 	}
 }
diff --git a/include/net/tcp.h b/include/net/tcp.h
index f42d300f0cfa..c82a23470081 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2103,6 +2103,8 @@ struct tcp_ulp_ops {
 
 	/* initialize ulp */
 	int (*init)(struct sock *sk);
+	/* update ulp */
+	void (*update)(struct sock *sk, struct proto *p);
 	/* cleanup ulp */
 	void (*release)(struct sock *sk);
 
@@ -2114,6 +2116,7 @@ void tcp_unregister_ulp(struct tcp_ulp_ops *type);
 int tcp_set_ulp(struct sock *sk, const char *name);
 void tcp_get_available_ulp(char *buf, size_t len);
 void tcp_cleanup_ulp(struct sock *sk);
+void tcp_update_ulp(struct sock *sk, struct proto *p);
 
 #define MODULE_ALIAS_TCP_ULP(name)				\
 	__MODULE_INFO(alias, alias_userspace, name);		\
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 93bffaad2135..6832eeb4b785 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -585,12 +585,12 @@ EXPORT_SYMBOL_GPL(sk_psock_destroy);
 
 void sk_psock_drop(struct sock *sk, struct sk_psock *psock)
 {
-	rcu_assign_sk_user_data(sk, NULL);
 	sk_psock_cork_free(psock);
 	sk_psock_zap_ingress(psock);
-	sk_psock_restore_proto(sk, psock);
 
 	write_lock_bh(&sk->sk_callback_lock);
+	sk_psock_restore_proto(sk, psock);
+	rcu_assign_sk_user_data(sk, NULL);
 	if (psock->progs.skb_parser)
 		sk_psock_stop_strp(sk, psock);
 	write_unlock_bh(&sk->sk_callback_lock);
diff --git a/net/ipv4/tcp_ulp.c b/net/ipv4/tcp_ulp.c
index 3d8a1d835471..4849edb62d52 100644
--- a/net/ipv4/tcp_ulp.c
+++ b/net/ipv4/tcp_ulp.c
@@ -96,6 +96,19 @@ void tcp_get_available_ulp(char *buf, size_t maxlen)
 	rcu_read_unlock();
 }
 
+void tcp_update_ulp(struct sock *sk, struct proto *proto)
+{
+	struct inet_connection_sock *icsk = inet_csk(sk);
+
+	if (!icsk->icsk_ulp_ops) {
+		sk->sk_prot = proto;
+		return;
+	}
+
+	if (icsk->icsk_ulp_ops->update)
+		icsk->icsk_ulp_ops->update(sk, proto);
+}
+
 void tcp_cleanup_ulp(struct sock *sk)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 48f1c26459d0..f208f8455ef2 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -328,7 +328,10 @@ static void tls_sk_proto_unhash(struct sock *sk)
 
 	ctx = tls_get_ctx(sk);
 	tls_sk_proto_cleanup(sk, ctx, timeo);
+	write_lock_bh(&sk->sk_callback_lock);
 	icsk->icsk_ulp_data = NULL;
+	sk->sk_prot = ctx->sk_proto;
+	write_unlock_bh(&sk->sk_callback_lock);
 
 	if (ctx->sk_proto->unhash)
 		ctx->sk_proto->unhash(sk);
@@ -337,7 +340,7 @@ static void tls_sk_proto_unhash(struct sock *sk)
 
 static void tls_sk_proto_close(struct sock *sk, long timeout)
 {
-	void (*sk_proto_close)(struct sock *sk, long timeout);
+	struct inet_connection_sock *icsk = inet_csk(sk);
 	struct tls_context *ctx = tls_get_ctx(sk);
 	long timeo = sock_sndtimeo(sk, 0);
 	bool free_ctx;
@@ -347,12 +350,15 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
 
 	lock_sock(sk);
 	free_ctx = ctx->tx_conf != TLS_HW && ctx->rx_conf != TLS_HW;
-	sk_proto_close = ctx->sk_proto_close;
 
 	if (ctx->tx_conf != TLS_BASE || ctx->rx_conf != TLS_BASE)
 		tls_sk_proto_cleanup(sk, ctx, timeo);
 
+	write_lock_bh(&sk->sk_callback_lock);
+	if (free_ctx)
+		icsk->icsk_ulp_data = NULL;
 	sk->sk_prot = ctx->sk_proto;
+	write_unlock_bh(&sk->sk_callback_lock);
 	release_sock(sk);
 	if (ctx->tx_conf == TLS_SW)
 		tls_sw_free_ctx_tx(ctx);
@@ -360,7 +366,7 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
 		tls_sw_strparser_done(ctx);
 	if (ctx->rx_conf == TLS_SW)
 		tls_sw_free_ctx_rx(ctx);
-	sk_proto_close(sk, timeout);
+	ctx->sk_proto_close(sk, timeout);
 
 	if (free_ctx)
 		tls_ctx_free(ctx);
@@ -827,7 +833,7 @@ static int tls_init(struct sock *sk)
 	int rc = 0;
 
 	if (tls_hw_prot(sk))
-		goto out;
+		return 0;
 
 	/* The TLS ulp is currently supported only for TCP sockets
 	 * in ESTABLISHED state.
@@ -838,22 +844,38 @@ static int tls_init(struct sock *sk)
 	if (sk->sk_state != TCP_ESTABLISHED)
 		return -ENOTSUPP;
 
+	tls_build_proto(sk);
+
 	/* allocate tls context */
+	write_lock_bh(&sk->sk_callback_lock);
 	ctx = create_ctx(sk);
 	if (!ctx) {
 		rc = -ENOMEM;
 		goto out;
 	}
 
-	tls_build_proto(sk);
 	ctx->tx_conf = TLS_BASE;
 	ctx->rx_conf = TLS_BASE;
 	ctx->sk_proto = sk->sk_prot;
 	update_sk_prot(sk, ctx);
 out:
+	write_unlock_bh(&sk->sk_callback_lock);
 	return rc;
 }
 
+static void tls_update(struct sock *sk, struct proto *p)
+{
+	struct tls_context *ctx;
+
+	ctx = tls_get_ctx(sk);
+	if (likely(ctx)) {
+		ctx->sk_proto_close = p->close;
+		ctx->sk_proto = p;
+	} else {
+		sk->sk_prot = p;
+	}
+}
+
 void tls_register_device(struct tls_device *device)
 {
 	spin_lock_bh(&device_spinlock);
@@ -874,6 +896,7 @@ static struct tcp_ulp_ops tcp_tls_ulp_ops __read_mostly = {
 	.name			= "tls",
 	.owner			= THIS_MODULE,
 	.init			= tls_init,
+	.update			= tls_update,
 };
 
 static int __init tls_register(void)
-- 
2.21.0



* [PATCH bpf v4 10/14] selftests/tls: add a test for ULP but no keys
From: Jakub Kicinski @ 2019-07-19 17:29 UTC
  To: john.fastabend, alexei.starovoitov, daniel
  Cc: edumazet, netdev, bpf, Jakub Kicinski, Dirk van der Merwe

Make sure we test the TLS_BASE/TLS_BASE case, both with data and
through the tear down/clean up path.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 tools/testing/selftests/net/tls.c | 74 +++++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)

diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index 090fff9dbc48..194826fee4f7 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -25,6 +25,80 @@
 #define TLS_PAYLOAD_MAX_LEN 16384
 #define SOL_TLS 282
 
+#ifndef ENOTSUPP
+#define ENOTSUPP 524
+#endif
+
+FIXTURE(tls_basic)
+{
+	int fd, cfd;
+	bool notls;
+};
+
+FIXTURE_SETUP(tls_basic)
+{
+	struct sockaddr_in addr;
+	socklen_t len;
+	int sfd, ret;
+
+	self->notls = false;
+	len = sizeof(addr);
+
+	addr.sin_family = AF_INET;
+	addr.sin_addr.s_addr = htonl(INADDR_ANY);
+	addr.sin_port = 0;
+
+	self->fd = socket(AF_INET, SOCK_STREAM, 0);
+	sfd = socket(AF_INET, SOCK_STREAM, 0);
+
+	ret = bind(sfd, &addr, sizeof(addr));
+	ASSERT_EQ(ret, 0);
+	ret = listen(sfd, 10);
+	ASSERT_EQ(ret, 0);
+
+	ret = getsockname(sfd, &addr, &len);
+	ASSERT_EQ(ret, 0);
+
+	ret = connect(self->fd, &addr, sizeof(addr));
+	ASSERT_EQ(ret, 0);
+
+	self->cfd = accept(sfd, &addr, &len);
+	ASSERT_GE(self->cfd, 0);
+
+	close(sfd);
+
+	ret = setsockopt(self->fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
+	if (ret != 0) {
+		ASSERT_EQ(errno, ENOTSUPP);
+		self->notls = true;
+		printf("Failure setting TCP_ULP, testing without tls\n");
+		return;
+	}
+
+	ret = setsockopt(self->cfd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
+	ASSERT_EQ(ret, 0);
+}
+
+FIXTURE_TEARDOWN(tls_basic)
+{
+	close(self->fd);
+	close(self->cfd);
+}
+
+/* Send some data through with ULP but no keys */
+TEST_F(tls_basic, base_base)
+{
+	char const *test_str = "test_read";
+	int send_len = 10;
+	char buf[10];
+
+	ASSERT_EQ(strlen(test_str) + 1, send_len);
+
+	EXPECT_EQ(send(self->fd, test_str, send_len, 0), send_len);
+	EXPECT_NE(recv(self->cfd, buf, send_len, 0), -1);
+	EXPECT_EQ(memcmp(buf, test_str, send_len), 0);
+};
+
 FIXTURE(tls)
 {
 	int fd, cfd;
-- 
2.21.0



* [PATCH bpf v4 11/14] selftests/tls: test error codes around TLS ULP installation
From: Jakub Kicinski @ 2019-07-19 17:29 UTC
  To: john.fastabend, alexei.starovoitov, daniel
  Cc: edumazet, netdev, bpf, Jakub Kicinski, Dirk van der Merwe

Test the error codes returned when the TCP connection is not in
the ESTABLISHED state.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 tools/testing/selftests/net/tls.c | 52 +++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index 194826fee4f7..10df77326d34 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -911,6 +911,58 @@ TEST_F(tls, control_msg)
 	EXPECT_EQ(memcmp(buf, test_str, send_len), 0);
 }
 
+TEST(non_established) {
+	struct tls12_crypto_info_aes_gcm_256 tls12;
+	struct sockaddr_in addr;
+	int sfd, ret, fd;
+	socklen_t len;
+
+	len = sizeof(addr);
+
+	memset(&tls12, 0, sizeof(tls12));
+	tls12.info.version = TLS_1_2_VERSION;
+	tls12.info.cipher_type = TLS_CIPHER_AES_GCM_256;
+
+	addr.sin_family = AF_INET;
+	addr.sin_addr.s_addr = htonl(INADDR_ANY);
+	addr.sin_port = 0;
+
+	fd = socket(AF_INET, SOCK_STREAM, 0);
+	sfd = socket(AF_INET, SOCK_STREAM, 0);
+
+	ret = bind(sfd, &addr, sizeof(addr));
+	ASSERT_EQ(ret, 0);
+	ret = listen(sfd, 10);
+	ASSERT_EQ(ret, 0);
+
+	ret = setsockopt(fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
+	EXPECT_EQ(ret, -1);
+	/* TLS ULP not supported */
+	if (errno == ENOENT)
+		return;
+	EXPECT_EQ(errno, ENOTSUPP);
+
+	ret = setsockopt(sfd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
+	EXPECT_EQ(ret, -1);
+	EXPECT_EQ(errno, ENOTSUPP);
+
+	ret = getsockname(sfd, &addr, &len);
+	ASSERT_EQ(ret, 0);
+
+	ret = connect(fd, &addr, sizeof(addr));
+	ASSERT_EQ(ret, 0);
+
+	ret = setsockopt(fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
+	ASSERT_EQ(ret, 0);
+
+	ret = setsockopt(fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
+	EXPECT_EQ(ret, -1);
+	EXPECT_EQ(errno, EEXIST);
+
+	close(fd);
+	close(sfd);
+}
+
 TEST(keysizes) {
 	struct tls12_crypto_info_aes_gcm_256 tls12;
 	struct sockaddr_in addr;
-- 
2.21.0



* [PATCH bpf v4 12/14] selftests/tls: add a bidirectional test
From: Jakub Kicinski @ 2019-07-19 17:29 UTC
  To: john.fastabend, alexei.starovoitov, daniel
  Cc: edumazet, netdev, bpf, Jakub Kicinski, Dirk van der Merwe

Add a simple test which installs TLS state for both directions,
then sends and receives data on both sockets.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 tools/testing/selftests/net/tls.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index 10df77326d34..6d78bd050813 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -684,6 +684,37 @@ TEST_F(tls, recv_lowat)
 	EXPECT_EQ(memcmp(send_mem, recv_mem + 10, 5), 0);
 }
 
+TEST_F(tls, bidir)
+{
+	struct tls12_crypto_info_aes_gcm_128 tls12;
+	char const *test_str = "test_read";
+	int send_len = 10;
+	char buf[10];
+	int ret;
+
+	memset(&tls12, 0, sizeof(tls12));
+	tls12.info.version = TLS_1_3_VERSION;
+	tls12.info.cipher_type = TLS_CIPHER_AES_GCM_128;
+
+	ret = setsockopt(self->fd, SOL_TLS, TLS_RX, &tls12, sizeof(tls12));
+	ASSERT_EQ(ret, 0);
+
+	ret = setsockopt(self->cfd, SOL_TLS, TLS_TX, &tls12, sizeof(tls12));
+	ASSERT_EQ(ret, 0);
+
+	ASSERT_EQ(strlen(test_str) + 1, send_len);
+
+	EXPECT_EQ(send(self->fd, test_str, send_len, 0), send_len);
+	EXPECT_NE(recv(self->cfd, buf, send_len, 0), -1);
+	EXPECT_EQ(memcmp(buf, test_str, send_len), 0);
+
+	memset(buf, 0, sizeof(buf));
+
+	EXPECT_EQ(send(self->cfd, test_str, send_len, 0), send_len);
+	EXPECT_NE(recv(self->fd, buf, send_len, 0), -1);
+	EXPECT_EQ(memcmp(buf, test_str, send_len), 0);
+};
+
 TEST_F(tls, pollin)
 {
 	char const *test_str = "test_poll";
-- 
2.21.0



* [PATCH bpf v4 13/14] selftests/tls: close the socket with open record
From: Jakub Kicinski @ 2019-07-19 17:29 UTC
  To: john.fastabend, alexei.starovoitov, daniel
  Cc: edumazet, netdev, bpf, Jakub Kicinski, Dirk van der Merwe

Add a test which sends some data with MSG_MORE and then closes
the socket (never calling send without MSG_MORE). This should
make sure we clean up open records correctly.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 tools/testing/selftests/net/tls.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index 6d78bd050813..94a86ca882de 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -239,6 +239,16 @@ TEST_F(tls, msg_more)
 	EXPECT_EQ(memcmp(buf, test_str, send_len), 0);
 }
 
+TEST_F(tls, msg_more_unsent)
+{
+	char const *test_str = "test_read";
+	int send_len = 10;
+	char buf[10];
+
+	EXPECT_EQ(send(self->fd, test_str, send_len, MSG_MORE), send_len);
+	EXPECT_EQ(recv(self->cfd, buf, send_len, MSG_DONTWAIT), -1);
+}
+
 TEST_F(tls, sendmsg_single)
 {
 	struct msghdr msg;
-- 
2.21.0



* [PATCH bpf v4 14/14] selftests/tls: add shutdown tests
From: Jakub Kicinski @ 2019-07-19 17:29 UTC
  To: john.fastabend, alexei.starovoitov, daniel
  Cc: edumazet, netdev, bpf, Jakub Kicinski, Dirk van der Merwe

Add a test for killing the connection via shutdown().

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 tools/testing/selftests/net/tls.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index 94a86ca882de..630c5b884d43 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -952,6 +952,33 @@ TEST_F(tls, control_msg)
 	EXPECT_EQ(memcmp(buf, test_str, send_len), 0);
 }
 
+TEST_F(tls, shutdown)
+{
+	char const *test_str = "test_read";
+	int send_len = 10;
+	char buf[10];
+
+	ASSERT_EQ(strlen(test_str) + 1, send_len);
+
+	EXPECT_EQ(send(self->fd, test_str, send_len, 0), send_len);
+	EXPECT_NE(recv(self->cfd, buf, send_len, 0), -1);
+	EXPECT_EQ(memcmp(buf, test_str, send_len), 0);
+
+	shutdown(self->fd, SHUT_RDWR);
+	shutdown(self->cfd, SHUT_RDWR);
+}
+
+TEST_F(tls, shutdown_unsent)
+{
+	char const *test_str = "test_read";
+	int send_len = 10;
+
+	EXPECT_EQ(send(self->fd, test_str, send_len, MSG_MORE), send_len);
+
+	shutdown(self->fd, SHUT_RDWR);
+	shutdown(self->cfd, SHUT_RDWR);
+}
+
 TEST(non_established) {
 	struct tls12_crypto_info_aes_gcm_256 tls12;
 	struct sockaddr_in addr;
-- 
2.21.0



* Re: [PATCH bpf v4 00/14] sockmap/tls fixes
From: Jakub Kicinski @ 2019-07-19 17:37 UTC
  To: john.fastabend, alexei.starovoitov, daniel
  Cc: edumazet, netdev, bpf, oss-drivers

On Fri, 19 Jul 2019 10:29:13 -0700, Jakub Kicinski wrote:
> John says:
> 
> Resolve a series of splats discovered by syzbot and an unhash
> TLS issue noted by Eric Dumazet.

Sorry for the delay, this code is quite tricky. According to my testing
TLS SW and HW should now work, I hope I didn't regress things on the
sockmap side.

This is not solving all the issues (ugh), apart from HW needing the
unhash/shutdown treatment, as discussed we may have a sender stuck in
wmem wait while we free context underneath. That's a "minor" UAF for
another day..


* Re: [PATCH bpf v4 00/14] sockmap/tls fixes
From: Daniel Borkmann @ 2019-07-22 14:22 UTC
  To: Jakub Kicinski, john.fastabend, alexei.starovoitov
  Cc: edumazet, netdev, bpf, oss-drivers

On 7/19/19 7:37 PM, Jakub Kicinski wrote:
> On Fri, 19 Jul 2019 10:29:13 -0700, Jakub Kicinski wrote:
>> John says:
>>
>> Resolve a series of splats discovered by syzbot and an unhash
>> TLS issue noted by Eric Dumazet.
> 
> Sorry for the delay, this code is quite tricky. According to my testing
> TLS SW and HW should now work, I hope I didn't regress things on the
> sockmap side.

Applied, thanks everyone!


* Re: [PATCH bpf v4 00/14] sockmap/tls fixes
From: John Fastabend @ 2019-07-22 15:46 UTC
  To: Jakub Kicinski, john.fastabend, alexei.starovoitov, daniel
  Cc: edumazet, netdev, bpf, oss-drivers

Jakub Kicinski wrote:
> On Fri, 19 Jul 2019 10:29:13 -0700, Jakub Kicinski wrote:
> > John says:
> > 
> > Resolve a series of splats discovered by syzbot and an unhash
> > TLS issue noted by Eric Dumazet.
> 
> Sorry for the delay, this code is quite tricky. According to my testing
> TLS SW and HW should now work, I hope I didn't regress things on the
> sockmap side.

I'll run it through our CI as well but looks good to me. Thanks a lot
for getting this finished up.

> 
> This is not solving all the issues (ugh), apart from HW needing the
> unhash/shutdown treatment, as discussed we may have a sender stuck in
> wmem wait while we free context underneath. That's a "minor" UAF for
> another day..

Agreed. But this should solve most of the syzbot issues and a few
crashes we saw in testing real workloads.

Thanks,
John


* Re: [PATCH bpf v4 00/14] sockmap/tls fixes
From: John Fastabend @ 2019-07-22 15:48 UTC
  To: Daniel Borkmann, Jakub Kicinski, john.fastabend, alexei.starovoitov
  Cc: edumazet, netdev, bpf, oss-drivers

Daniel Borkmann wrote:
> On 7/19/19 7:37 PM, Jakub Kicinski wrote:
> > On Fri, 19 Jul 2019 10:29:13 -0700, Jakub Kicinski wrote:
> >> John says:
> >>
> >> Resolve a series of splats discovered by syzbot and an unhash
> >> TLS issue noted by Eric Dumazet.
> > 
> > Sorry for the delay, this code is quite tricky. According to my testing
> > TLS SW and HW should now work, I hope I didn't regress things on the
> > sockmap side.
> 
> Applied, thanks everyone!

Thanks Jakub. For the patches without my Signed-off-by already:

Acked-by: John Fastabend <john.fastabend@gmail.com>

