All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next] net/tls: prevent skb_orphan() from leaking TLS plain text with offload
@ 2019-07-30 21:12 Jakub Kicinski
  2019-07-31 13:57 ` Boris Pismenny
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Jakub Kicinski @ 2019-07-30 21:12 UTC (permalink / raw)
  To: davem
  Cc: netdev, oss-drivers, edumazet, borisp, aviadye, davejwatson,
	john.fastabend, daniel, willemb, xiyou.wangcong, fw,
	alexei.starovoitov, Jakub Kicinski

sk_validate_xmit_skb() and drivers depend on the sk member of
struct sk_buff to identify segments requiring encryption.
Any operation which removes or does not preserve the original TLS
socket such as skb_orphan() or skb_clone() will cause clear text
leaks.

Make the TCP socket underlying an offloaded TLS connection
mark all skbs as decrypted, if TLS TX is in offload mode.
Then in sk_validate_xmit_skb() catch skbs which have no socket
(or a socket with no validation) and decrypted flag set.

Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and
sk->sk_validate_xmit_skb are slightly interchangeable right now,
they all imply TLS offload. The new checks are guarded by
CONFIG_TLS_DEVICE because that's the option guarding the
sk_buff->decrypted member.

Second, smaller issue with orphaning is that it breaks
the guarantee that packets will be delivered to device
queues in-order. All TLS offload drivers depend on that
scheduling property. This means skb_orphan_partial()'s
trick of preserving partial socket references will cause
issues in the drivers. We need a full orphan, and as a
result netem delay/throttling will cause all TLS offload
skbs to be dropped.

Reusing the sk_buff->decrypted flag also protects from
leaking clear text when incoming, decrypted skb is redirected
(e.g. by TC).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
I'm sending this for net-next because of lack of confidence
in my own abilities. It should apply cleanly to net... :)

 Documentation/networking/tls-offload.rst |  9 --------
 include/net/sock.h                       | 28 +++++++++++++++++++++++-
 net/core/skbuff.c                        |  3 +++
 net/core/sock.c                          | 22 ++++++++++++++-----
 net/ipv4/tcp.c                           |  4 +++-
 net/ipv4/tcp_output.c                    |  3 +++
 net/tls/tls_device.c                     |  2 ++
 7 files changed, 55 insertions(+), 16 deletions(-)

diff --git a/Documentation/networking/tls-offload.rst b/Documentation/networking/tls-offload.rst
index 048e5ca44824..2bc3ab5515d8 100644
--- a/Documentation/networking/tls-offload.rst
+++ b/Documentation/networking/tls-offload.rst
@@ -499,15 +499,6 @@ offloads, old connections will remain active after flags are cleared.
 Known bugs
 ==========
 
-skb_orphan() leaks clear text
------------------------------
-
-Currently drivers depend on the :c:member:`sk` member of
-:c:type:`struct sk_buff <sk_buff>` to identify segments requiring
-encryption. Any operation which removes or does not preserve the socket
-association such as :c:func:`skb_orphan` or :c:func:`skb_clone`
-will cause the driver to miss the packets and lead to clear text leaks.
-
 Redirects leak clear text
 -------------------------
 
diff --git a/include/net/sock.h b/include/net/sock.h
index 228db3998e46..90f3f552c789 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -814,6 +814,7 @@ enum sock_flags {
 	SOCK_TXTIME,
 	SOCK_XDP, /* XDP is attached */
 	SOCK_TSTAMP_NEW, /* Indicates 64 bit timestamps always */
+	SOCK_CRYPTO_TX_PLAIN_TEXT, /* Generate skbs with decrypted flag set */
 };
 
 #define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
@@ -2480,8 +2481,26 @@ static inline bool sk_fullsock(const struct sock *sk)
 	return (1 << sk->sk_state) & ~(TCPF_TIME_WAIT | TCPF_NEW_SYN_RECV);
 }
 
+static inline void sk_set_tx_crypto(const struct sock *sk, struct sk_buff *skb)
+{
+#ifdef CONFIG_TLS_DEVICE
+	skb->decrypted = sock_flag(sk, SOCK_CRYPTO_TX_PLAIN_TEXT);
+#endif
+}
+
+static inline bool sk_tx_crypto_match(const struct sock *sk,
+				      const struct sk_buff *skb)
+{
+#ifdef CONFIG_TLS_DEVICE
+	return sock_flag(sk, SOCK_CRYPTO_TX_PLAIN_TEXT) == !!skb->decrypted;
+#else
+	return true;
+#endif
+}
+
 /* Checks if this SKB belongs to an HW offloaded socket
  * and whether any SW fallbacks are required based on dev.
+ * Check decrypted mark in case skb_orphan() cleared socket.
  */
 static inline struct sk_buff *sk_validate_xmit_skb(struct sk_buff *skb,
 						   struct net_device *dev)
@@ -2489,8 +2508,15 @@ static inline struct sk_buff *sk_validate_xmit_skb(struct sk_buff *skb,
 #ifdef CONFIG_SOCK_VALIDATE_XMIT
 	struct sock *sk = skb->sk;
 
-	if (sk && sk_fullsock(sk) && sk->sk_validate_xmit_skb)
+	if (sk && sk_fullsock(sk) && sk->sk_validate_xmit_skb) {
 		skb = sk->sk_validate_xmit_skb(sk, dev, skb);
+#ifdef CONFIG_TLS_DEVICE
+	} else if (unlikely(skb->decrypted)) {
+		pr_warn_ratelimited("unencrypted skb with no associated socket - dropping\n");
+		kfree_skb(skb);
+		skb = NULL;
+#endif
+	}
 #endif
 
 	return skb;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 0b788df5a75b..9e92684479b9 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3794,6 +3794,9 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 
 			skb_reserve(nskb, headroom);
 			__skb_put(nskb, doffset);
+#ifdef CONFIG_TLS_DEVICE
+			nskb->decrypted = head_skb->decrypted;
+#endif
 		}
 
 		if (segs)
diff --git a/net/core/sock.c b/net/core/sock.c
index d57b0cc995a0..b0c10b518e65 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1992,6 +1992,22 @@ void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
 }
 EXPORT_SYMBOL(skb_set_owner_w);
 
+static bool can_skb_orphan_partial(const struct sk_buff *skb)
+{
+#ifdef CONFIG_TLS_DEVICE
+	/* Drivers depend on in-order delivery for crypto offload,
+	 * partial orphan breaks out-of-order-OK logic.
+	 */
+	if (skb->decrypted)
+		return false;
+#endif
+#ifdef CONFIG_INET
+	if (skb->destructor == tcp_wfree)
+		return true;
+#endif
+	return skb->destructor == sock_wfree;
+}
+
 /* This helper is used by netem, as it can hold packets in its
  * delay queue. We want to allow the owner socket to send more
  * packets, as if they were already TX completed by a typical driver.
@@ -2003,11 +2019,7 @@ void skb_orphan_partial(struct sk_buff *skb)
 	if (skb_is_tcp_pure_ack(skb))
 		return;
 
-	if (skb->destructor == sock_wfree
-#ifdef CONFIG_INET
-	    || skb->destructor == tcp_wfree
-#endif
-		) {
+	if (can_skb_orphan_partial(skb)) {
 		struct sock *sk = skb->sk;
 
 		if (refcount_inc_not_zero(&sk->sk_refcnt)) {
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f62f0e7e3cdd..dee93eab02fe 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -984,6 +984,7 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
 			if (!skb)
 				goto wait_for_memory;
 
+			sk_set_tx_crypto(sk, skb);
 			skb_entail(sk, skb);
 			copy = size_goal;
 		}
@@ -993,7 +994,8 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
 
 		i = skb_shinfo(skb)->nr_frags;
 		can_coalesce = skb_can_coalesce(skb, i, page, offset);
-		if (!can_coalesce && i >= sysctl_max_skb_frags) {
+		if ((!can_coalesce && i >= sysctl_max_skb_frags) ||
+		    !sk_tx_crypto_match(sk, skb)) {
 			tcp_mark_push(tp, skb);
 			goto new_segment;
 		}
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 6e4afc48d7bb..9efd0ca44d49 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1320,6 +1320,7 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue,
 	buff = sk_stream_alloc_skb(sk, nsize, gfp, true);
 	if (!buff)
 		return -ENOMEM; /* We'll just try again later. */
+	sk_set_tx_crypto(sk, buff);
 
 	sk->sk_wmem_queued += buff->truesize;
 	sk_mem_charge(sk, buff->truesize);
@@ -1874,6 +1875,7 @@ static int tso_fragment(struct sock *sk, struct sk_buff *skb, unsigned int len,
 	buff = sk_stream_alloc_skb(sk, 0, gfp, true);
 	if (unlikely(!buff))
 		return -ENOMEM;
+	sk_set_tx_crypto(sk, buff);
 
 	sk->sk_wmem_queued += buff->truesize;
 	sk_mem_charge(sk, buff->truesize);
@@ -2139,6 +2141,7 @@ static int tcp_mtu_probe(struct sock *sk)
 	nskb = sk_stream_alloc_skb(sk, probe_size, GFP_ATOMIC, false);
 	if (!nskb)
 		return -1;
+	sk_set_tx_crypto(sk, nskb);
 	sk->sk_wmem_queued += nskb->truesize;
 	sk_mem_charge(sk, nskb->truesize);
 
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 4ec8a06fa5d1..3d78742b3b1b 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -970,6 +970,8 @@ int tls_set_device_offload(struct sock *sk, struct tls_context *ctx)
 
 	tls_device_attach(ctx, sk, netdev);
 
+	sock_set_flag(sk, SOCK_CRYPTO_TX_PLAIN_TEXT);
+
 	/* following this assignment tls_is_sk_tx_device_offloaded
 	 * will return true and the context might be accessed
 	 * by the netdev's xmit function.
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] net/tls: prevent skb_orphan() from leaking TLS plain text with offload
  2019-07-30 21:12 [PATCH net-next] net/tls: prevent skb_orphan() from leaking TLS plain text with offload Jakub Kicinski
@ 2019-07-31 13:57 ` Boris Pismenny
  2019-07-31 18:43   ` Jakub Kicinski
  2019-07-31 15:57 ` Willem de Bruijn
  2019-08-03  0:24 ` David Miller
  2 siblings, 1 reply; 8+ messages in thread
From: Boris Pismenny @ 2019-07-31 13:57 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, oss-drivers, edumazet, Aviad Yehezkel, davejwatson,
	john.fastabend, daniel, willemb, xiyou.wangcong, fw,
	alexei.starovoitov

Hi Jacub,

On 7/31/2019 12:12 AM, Jakub Kicinski wrote:
> sk_validate_xmit_skb() and drivers depend on the sk member of
> struct sk_buff to identify segments requiring encryption.
> Any operation which removes or does not preserve the original TLS
> socket such as skb_orphan() or skb_clone() will cause clear text
> leaks.
>
> Make the TCP socket underlying an offloaded TLS connection
> mark all skbs as decrypted, if TLS TX is in offload mode.
> Then in sk_validate_xmit_skb() catch skbs which have no socket
> (or a socket with no validation) and decrypted flag set.
>
> Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and
> sk->sk_validate_xmit_skb are slightly interchangeable right now,
> they all imply TLS offload. The new checks are guarded by
> CONFIG_TLS_DEVICE because that's the option guarding the
> sk_buff->decrypted member.
>
> Second, smaller issue with orphaning is that it breaks
> the guarantee that packets will be delivered to device
> queues in-order. All TLS offload drivers depend on that
> scheduling property. This means skb_orphan_partial()'s
> trick of preserving partial socket references will cause
> issues in the drivers. We need a full orphan, and as a
> result netem delay/throttling will cause all TLS offload
> skbs to be dropped.
>
> Reusing the sk_buff->decrypted flag also protects from
> leaking clear text when incoming, decrypted skb is redirected
> (e.g. by TC).

Nice!

>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> ---
> I'm sending this for net-next because of lack of confidence
> in my own abilities. It should apply cleanly to net... :)
>
>   Documentation/networking/tls-offload.rst |  9 --------
>   include/net/sock.h                       | 28 +++++++++++++++++++++++-
>   net/core/skbuff.c                        |  3 +++
>   net/core/sock.c                          | 22 ++++++++++++++-----
>   net/ipv4/tcp.c                           |  4 +++-
>   net/ipv4/tcp_output.c                    |  3 +++
>   net/tls/tls_device.c                     |  2 ++
>   7 files changed, 55 insertions(+), 16 deletions(-)
>
> diff --git a/Documentation/networking/tls-offload.rst b/Documentation/networking/tls-offload.rst
> index 048e5ca44824..2bc3ab5515d8 100644
> --- a/Documentation/networking/tls-offload.rst
> +++ b/Documentation/networking/tls-offload.rst
> @@ -499,15 +499,6 @@ offloads, old connections will remain active after flags are cleared.
>   Known bugs
>   ==========
>   
> -skb_orphan() leaks clear text
> ------------------------------
> -
> -Currently drivers depend on the :c:member:`sk` member of
> -:c:type:`struct sk_buff <sk_buff>` to identify segments requiring
> -encryption. Any operation which removes or does not preserve the socket
> -association such as :c:func:`skb_orphan` or :c:func:`skb_clone`
> -will cause the driver to miss the packets and lead to clear text leaks.
> -
>   Redirects leak clear text
>   -------------------------
Doesn't this patch cover the redirect case as well?
>   
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 228db3998e46..90f3f552c789 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -814,6 +814,7 @@ enum sock_flags {
>   	SOCK_TXTIME,
>   	SOCK_XDP, /* XDP is attached */
>   	SOCK_TSTAMP_NEW, /* Indicates 64 bit timestamps always */
> +	SOCK_CRYPTO_TX_PLAIN_TEXT, /* Generate skbs with decrypted flag set */
>   };
>   
>   #define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
> @@ -2480,8 +2481,26 @@ static inline bool sk_fullsock(const struct sock *sk)
>   	return (1 << sk->sk_state) & ~(TCPF_TIME_WAIT | TCPF_NEW_SYN_RECV);
>   }
>   
> +static inline void sk_set_tx_crypto(const struct sock *sk, struct sk_buff *skb)
> +{
> +#ifdef CONFIG_TLS_DEVICE
> +	skb->decrypted = sock_flag(sk, SOCK_CRYPTO_TX_PLAIN_TEXT);
> +#endif
> +}

Have you considered the following alternative to calling 
sk_set_tx_crypto() - Add a new MSG_DECRYPTE flag that will set the 
skb->decrypted bit in the do_tcp_sendpages function.

Then, the rest of the TCP code can propagate this bit from the existing skb.

> +
> +static inline bool sk_tx_crypto_match(const struct sock *sk,
> +				      const struct sk_buff *skb)
> +{
> +#ifdef CONFIG_TLS_DEVICE
> +	return sock_flag(sk, SOCK_CRYPTO_TX_PLAIN_TEXT) == !!skb->decrypted;
> +#else
> +	return true;
> +#endif
> +}
> +
>   /* Checks if this SKB belongs to an HW offloaded socket
>    * and whether any SW fallbacks are required based on dev.
> + * Check decrypted mark in case skb_orphan() cleared socket.
>    */
>   static inline struct sk_buff *sk_validate_xmit_skb(struct sk_buff *skb,
>   						   struct net_device *dev)
> @@ -2489,8 +2508,15 @@ static inline struct sk_buff *sk_validate_xmit_skb(struct sk_buff *skb,
>   #ifdef CONFIG_SOCK_VALIDATE_XMIT
>   	struct sock *sk = skb->sk;
>   
> -	if (sk && sk_fullsock(sk) && sk->sk_validate_xmit_skb)
> +	if (sk && sk_fullsock(sk) && sk->sk_validate_xmit_skb) {
>   		skb = sk->sk_validate_xmit_skb(sk, dev, skb);
> +#ifdef CONFIG_TLS_DEVICE
> +	} else if (unlikely(skb->decrypted)) {
> +		pr_warn_ratelimited("unencrypted skb with no associated socket - dropping\n");
> +		kfree_skb(skb);
> +		skb = NULL;
> +#endif
> +	}
>   #endif
>   
>   	return skb;
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 0b788df5a75b..9e92684479b9 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -3794,6 +3794,9 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>   
>   			skb_reserve(nskb, headroom);
>   			__skb_put(nskb, doffset);
> +#ifdef CONFIG_TLS_DEVICE
> +			nskb->decrypted = head_skb->decrypted;
> +#endif
>   		}
>   
>   		if (segs)
> diff --git a/net/core/sock.c b/net/core/sock.c
> index d57b0cc995a0..b0c10b518e65 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1992,6 +1992,22 @@ void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
>   }
>   EXPORT_SYMBOL(skb_set_owner_w);
>   
> +static bool can_skb_orphan_partial(const struct sk_buff *skb)
> +{
> +#ifdef CONFIG_TLS_DEVICE
> +	/* Drivers depend on in-order delivery for crypto offload,
> +	 * partial orphan breaks out-of-order-OK logic.
> +	 */
> +	if (skb->decrypted)
> +		return false;
> +#endif
> +#ifdef CONFIG_INET
> +	if (skb->destructor == tcp_wfree)
> +		return true;
> +#endif
> +	return skb->destructor == sock_wfree;
> +}
> +
>   /* This helper is used by netem, as it can hold packets in its
>    * delay queue. We want to allow the owner socket to send more
>    * packets, as if they were already TX completed by a typical driver.
> @@ -2003,11 +2019,7 @@ void skb_orphan_partial(struct sk_buff *skb)
>   	if (skb_is_tcp_pure_ack(skb))
>   		return;
>   
> -	if (skb->destructor == sock_wfree
> -#ifdef CONFIG_INET
> -	    || skb->destructor == tcp_wfree
> -#endif
> -		) {
> +	if (can_skb_orphan_partial(skb)) {
>   		struct sock *sk = skb->sk;
>   
>   		if (refcount_inc_not_zero(&sk->sk_refcnt)) {
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index f62f0e7e3cdd..dee93eab02fe 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -984,6 +984,7 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
>   			if (!skb)
>   				goto wait_for_memory;
>   
> +			sk_set_tx_crypto(sk, skb);
>   			skb_entail(sk, skb);
>   			copy = size_goal;
>   		}
> @@ -993,7 +994,8 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
>   
>   		i = skb_shinfo(skb)->nr_frags;
>   		can_coalesce = skb_can_coalesce(skb, i, page, offset);
> -		if (!can_coalesce && i >= sysctl_max_skb_frags) {
> +		if ((!can_coalesce && i >= sysctl_max_skb_frags) ||
> +		    !sk_tx_crypto_match(sk, skb)) {

I see that sk_tx_crypto_match is called only here to handle cases where 
the socket expected crypto offload, while the skb is not marked 
decrypted. AFAIU, this should not be possible, because we set the 
skb->eor bit for the last plaintext skb before sending any traffic that 
expects crypto offload.



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] net/tls: prevent skb_orphan() from leaking TLS plain text with offload
  2019-07-30 21:12 [PATCH net-next] net/tls: prevent skb_orphan() from leaking TLS plain text with offload Jakub Kicinski
  2019-07-31 13:57 ` Boris Pismenny
@ 2019-07-31 15:57 ` Willem de Bruijn
  2019-07-31 18:12   ` [oss-drivers] " Jakub Kicinski
  2019-08-03  0:24 ` David Miller
  2 siblings, 1 reply; 8+ messages in thread
From: Willem de Bruijn @ 2019-07-31 15:57 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David Miller, Network Development, oss-drivers, Eric Dumazet,
	borisp, aviadye, davejwatson, John Fastabend, Daniel Borkmann,
	Cong Wang, Florian Westphal, Alexei Starovoitov

On Tue, Jul 30, 2019 at 5:13 PM Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
>
> sk_validate_xmit_skb() and drivers depend on the sk member of
> struct sk_buff to identify segments requiring encryption.
> Any operation which removes or does not preserve the original TLS
> socket such as skb_orphan() or skb_clone() will cause clear text
> leaks.
>
> Make the TCP socket underlying an offloaded TLS connection
> mark all skbs as decrypted, if TLS TX is in offload mode.
> Then in sk_validate_xmit_skb() catch skbs which have no socket
> (or a socket with no validation) and decrypted flag set.
>
> Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and
> sk->sk_validate_xmit_skb are slightly interchangeable right now,
> they all imply TLS offload. The new checks are guarded by
> CONFIG_TLS_DEVICE because that's the option guarding the
> sk_buff->decrypted member.
>
> Second, smaller issue with orphaning is that it breaks
> the guarantee that packets will be delivered to device
> queues in-order. All TLS offload drivers depend on that
> scheduling property. This means skb_orphan_partial()'s
> trick of preserving partial socket references will cause
> issues in the drivers. We need a full orphan, and as a
> result netem delay/throttling will cause all TLS offload
> skbs to be dropped.
>
> Reusing the sk_buff->decrypted flag also protects from
> leaking clear text when incoming, decrypted skb is redirected
> (e.g. by TC).
>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> ---
> I'm sending this for net-next because of lack of confidence
> in my own abilities. It should apply cleanly to net... :)
>
>  Documentation/networking/tls-offload.rst |  9 --------
>  include/net/sock.h                       | 28 +++++++++++++++++++++++-
>  net/core/skbuff.c                        |  3 +++
>  net/core/sock.c                          | 22 ++++++++++++++-----
>  net/ipv4/tcp.c                           |  4 +++-
>  net/ipv4/tcp_output.c                    |  3 +++
>  net/tls/tls_device.c                     |  2 ++
>  7 files changed, 55 insertions(+), 16 deletions(-)

>  Redirects leak clear text
>  -------------------------
>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 228db3998e46..90f3f552c789 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -814,6 +814,7 @@ enum sock_flags {
>         SOCK_TXTIME,
>         SOCK_XDP, /* XDP is attached */
>         SOCK_TSTAMP_NEW, /* Indicates 64 bit timestamps always */
> +       SOCK_CRYPTO_TX_PLAIN_TEXT, /* Generate skbs with decrypted flag set */
>  };
>
>  #define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
> @@ -2480,8 +2481,26 @@ static inline bool sk_fullsock(const struct sock *sk)
>         return (1 << sk->sk_state) & ~(TCPF_TIME_WAIT | TCPF_NEW_SYN_RECV);
>  }
>
> +static inline void sk_set_tx_crypto(const struct sock *sk, struct sk_buff *skb)

nit: skb_.. instead of sk_.. ?

> +{
> +#ifdef CONFIG_TLS_DEVICE
> +       skb->decrypted = sock_flag(sk, SOCK_CRYPTO_TX_PLAIN_TEXT);
> +#endif
> +}

> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 0b788df5a75b..9e92684479b9 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -3794,6 +3794,9 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>
>                         skb_reserve(nskb, headroom);
>                         __skb_put(nskb, doffset);
> +#ifdef CONFIG_TLS_DEVICE
> +                       nskb->decrypted = head_skb->decrypted;
> +#endif

decrypted is between header_start and headers_end, so
__copy_skb_header just below should take care of this?


> diff --git a/net/core/sock.c b/net/core/sock.c
> index d57b0cc995a0..b0c10b518e65 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1992,6 +1992,22 @@ void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
>  }
>  EXPORT_SYMBOL(skb_set_owner_w);
>
> +static bool can_skb_orphan_partial(const struct sk_buff *skb)
> +{
> +#ifdef CONFIG_TLS_DEVICE
> +       /* Drivers depend on in-order delivery for crypto offload,
> +        * partial orphan breaks out-of-order-OK logic.
> +        */
> +       if (skb->decrypted)
> +               return false;
> +#endif
> +#ifdef CONFIG_INET
> +       if (skb->destructor == tcp_wfree)
> +               return true;
> +#endif
> +       return skb->destructor == sock_wfree;
> +}
> +

Just insert the skb->decrypted check into skb_orphan_partial for less
code churn?

I also think that this is an independent concern from leaking plain
text, so perhaps could be a separate patch.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [oss-drivers] Re: [PATCH net-next] net/tls: prevent skb_orphan() from leaking TLS plain text with offload
  2019-07-31 15:57 ` Willem de Bruijn
@ 2019-07-31 18:12   ` Jakub Kicinski
  2019-07-31 19:07     ` Willem de Bruijn
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2019-07-31 18:12 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David Miller, Network Development, oss-drivers, Eric Dumazet,
	borisp, aviadye, davejwatson, John Fastabend, Daniel Borkmann,
	Cong Wang, Florian Westphal, Alexei Starovoitov

On Wed, 31 Jul 2019 11:57:10 -0400, Willem de Bruijn wrote:
> On Tue, Jul 30, 2019 at 5:13 PM Jakub Kicinski wrote:
> > sk_validate_xmit_skb() and drivers depend on the sk member of
> > struct sk_buff to identify segments requiring encryption.
> > Any operation which removes or does not preserve the original TLS
> > socket such as skb_orphan() or skb_clone() will cause clear text
> > leaks.
> >
> > Make the TCP socket underlying an offloaded TLS connection
> > mark all skbs as decrypted, if TLS TX is in offload mode.
> > Then in sk_validate_xmit_skb() catch skbs which have no socket
> > (or a socket with no validation) and decrypted flag set.
> >
> > Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and
> > sk->sk_validate_xmit_skb are slightly interchangeable right now,
> > they all imply TLS offload. The new checks are guarded by
> > CONFIG_TLS_DEVICE because that's the option guarding the
> > sk_buff->decrypted member.
> >
> > Second, smaller issue with orphaning is that it breaks
> > the guarantee that packets will be delivered to device
> > queues in-order. All TLS offload drivers depend on that
> > scheduling property. This means skb_orphan_partial()'s
> > trick of preserving partial socket references will cause
> > issues in the drivers. We need a full orphan, and as a
> > result netem delay/throttling will cause all TLS offload
> > skbs to be dropped.
> >
> > Reusing the sk_buff->decrypted flag also protects from
> > leaking clear text when incoming, decrypted skb is redirected
> > (e.g. by TC).
> >
> > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>

> > diff --git a/net/core/sock.c b/net/core/sock.c
> > index d57b0cc995a0..b0c10b518e65 100644
> > --- a/net/core/sock.c
> > +++ b/net/core/sock.c
> > @@ -1992,6 +1992,22 @@ void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
> >  }
> >  EXPORT_SYMBOL(skb_set_owner_w);
> >
> > +static bool can_skb_orphan_partial(const struct sk_buff *skb)
> > +{
> > +#ifdef CONFIG_TLS_DEVICE
> > +       /* Drivers depend on in-order delivery for crypto offload,
> > +        * partial orphan breaks out-of-order-OK logic.
> > +        */
> > +       if (skb->decrypted)
> > +               return false;
> > +#endif
> > +#ifdef CONFIG_INET
> > +       if (skb->destructor == tcp_wfree)
> > +               return true;
> > +#endif
> > +       return skb->destructor == sock_wfree;
> > +}
> > +  
> 
> Just insert the skb->decrypted check into skb_orphan_partial for less
> code churn?

Okie.. skb_orphan_partial() is a little ugly but will do.

> I also think that this is an independent concern from leaking plain
> text, so perhaps could be a separate patch.

Do you mean the out-of-order stuff is a separate concern?

It is, I had them separate at the first try, but GSO code looks at
the destructor and IIRC only copies the socket if its still tcp_wfree.
If we let partial orphan be we have to do temporary hairy stuff in
tcp_gso_segment(). It's easier to just deal with partial orphan here.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] net/tls: prevent skb_orphan() from leaking TLS plain text with offload
  2019-07-31 13:57 ` Boris Pismenny
@ 2019-07-31 18:43   ` Jakub Kicinski
  0 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2019-07-31 18:43 UTC (permalink / raw)
  To: Boris Pismenny
  Cc: davem, netdev, oss-drivers, edumazet, Aviad Yehezkel,
	davejwatson, john.fastabend, daniel, willemb, xiyou.wangcong, fw,
	alexei.starovoitov

On Wed, 31 Jul 2019 13:57:26 +0000, Boris Pismenny wrote:
> > diff --git a/Documentation/networking/tls-offload.rst b/Documentation/networking/tls-offload.rst
> > index 048e5ca44824..2bc3ab5515d8 100644
> > --- a/Documentation/networking/tls-offload.rst
> > +++ b/Documentation/networking/tls-offload.rst
> > @@ -499,15 +499,6 @@ offloads, old connections will remain active after flags are cleared.
> >   Known bugs
> >   ==========
> >   
> > -skb_orphan() leaks clear text
> > ------------------------------
> > -
> > -Currently drivers depend on the :c:member:`sk` member of
> > -:c:type:`struct sk_buff <sk_buff>` to identify segments requiring
> > -encryption. Any operation which removes or does not preserve the socket
> > -association such as :c:func:`skb_orphan` or :c:func:`skb_clone`
> > -will cause the driver to miss the packets and lead to clear text leaks.
> > -
> >   Redirects leak clear text
> >   -------------------------  
> Doesn't this patch cover the redirect case as well?

Ah, good catch! I thought this entry was for socket redirect, which
will be a separate fix, but it seems we don't have an entry for that
one. 

> > diff --git a/include/net/sock.h b/include/net/sock.h
> > index 228db3998e46..90f3f552c789 100644
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -814,6 +814,7 @@ enum sock_flags {
> >   	SOCK_TXTIME,
> >   	SOCK_XDP, /* XDP is attached */
> >   	SOCK_TSTAMP_NEW, /* Indicates 64 bit timestamps always */
> > +	SOCK_CRYPTO_TX_PLAIN_TEXT, /* Generate skbs with decrypted flag set */
> >   };
> >   
> >   #define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
> > @@ -2480,8 +2481,26 @@ static inline bool sk_fullsock(const struct sock *sk)
> >   	return (1 << sk->sk_state) & ~(TCPF_TIME_WAIT | TCPF_NEW_SYN_RECV);
> >   }
> >   
> > +static inline void sk_set_tx_crypto(const struct sock *sk, struct sk_buff *skb)
> > +{
> > +#ifdef CONFIG_TLS_DEVICE
> > +	skb->decrypted = sock_flag(sk, SOCK_CRYPTO_TX_PLAIN_TEXT);
> > +#endif
> > +}  
> 
> Have you considered the following alternative to calling 
> sk_set_tx_crypto() - Add a new MSG_DECRYPTE flag that will set the 
> skb->decrypted bit in the do_tcp_sendpages function.
> 
> Then, the rest of the TCP code can propagate this bit from the existing skb.

That'd also work marking the socket as crypto seemed easy enough. I was
planning on using sk_rx_crypto_match() for socket redirect also, so
it'd be symmetrical. Is there a preference for using the internal flags?

> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index f62f0e7e3cdd..dee93eab02fe 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -984,6 +984,7 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
> >   			if (!skb)
> >   				goto wait_for_memory;
> >   
> > +			sk_set_tx_crypto(sk, skb);
> >   			skb_entail(sk, skb);
> >   			copy = size_goal;
> >   		}
> > @@ -993,7 +994,8 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
> >   
> >   		i = skb_shinfo(skb)->nr_frags;
> >   		can_coalesce = skb_can_coalesce(skb, i, page, offset);
> > -		if (!can_coalesce && i >= sysctl_max_skb_frags) {
> > +		if ((!can_coalesce && i >= sysctl_max_skb_frags) ||
> > +		    !sk_tx_crypto_match(sk, skb)) {  
> 
> I see that sk_tx_crypto_match is called only here to handle cases where 
> the socket expected crypto offload, while the skb is not marked 
> decrypted. AFAIU, this should not be possible, because we set the 
> skb->eor bit for the last plaintext skb before sending any traffic that 
> expects crypto offload.

Ack, I missed the tcp_skb_can_collapse_to() above.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [oss-drivers] Re: [PATCH net-next] net/tls: prevent skb_orphan() from leaking TLS plain text with offload
  2019-07-31 18:12   ` [oss-drivers] " Jakub Kicinski
@ 2019-07-31 19:07     ` Willem de Bruijn
  0 siblings, 0 replies; 8+ messages in thread
From: Willem de Bruijn @ 2019-07-31 19:07 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David Miller, Network Development, oss-drivers, Eric Dumazet,
	borisp, aviadye, davejwatson, John Fastabend, Daniel Borkmann,
	Cong Wang, Florian Westphal, Alexei Starovoitov

On Wed, Jul 31, 2019 at 2:12 PM Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
>
> On Wed, 31 Jul 2019 11:57:10 -0400, Willem de Bruijn wrote:
> > On Tue, Jul 30, 2019 at 5:13 PM Jakub Kicinski wrote:
> > > sk_validate_xmit_skb() and drivers depend on the sk member of
> > > struct sk_buff to identify segments requiring encryption.
> > > Any operation which removes or does not preserve the original TLS
> > > socket such as skb_orphan() or skb_clone() will cause clear text
> > > leaks.
> > >
> > > Make the TCP socket underlying an offloaded TLS connection
> > > mark all skbs as decrypted, if TLS TX is in offload mode.
> > > Then in sk_validate_xmit_skb() catch skbs which have no socket
> > > (or a socket with no validation) and decrypted flag set.
> > >
> > > Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and
> > > sk->sk_validate_xmit_skb are slightly interchangeable right now,
> > > they all imply TLS offload. The new checks are guarded by
> > > CONFIG_TLS_DEVICE because that's the option guarding the
> > > sk_buff->decrypted member.
> > >
> > > Second, smaller issue with orphaning is that it breaks
> > > the guarantee that packets will be delivered to device
> > > queues in-order. All TLS offload drivers depend on that
> > > scheduling property. This means skb_orphan_partial()'s
> > > trick of preserving partial socket references will cause
> > > issues in the drivers. We need a full orphan, and as a
> > > result netem delay/throttling will cause all TLS offload
> > > skbs to be dropped.
> > >
> > > Reusing the sk_buff->decrypted flag also protects from
> > > leaking clear text when incoming, decrypted skb is redirected
> > > (e.g. by TC).
> > >
> > > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
>
> > > diff --git a/net/core/sock.c b/net/core/sock.c
> > > index d57b0cc995a0..b0c10b518e65 100644
> > > --- a/net/core/sock.c
> > > +++ b/net/core/sock.c
> > > @@ -1992,6 +1992,22 @@ void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
> > >  }
> > >  EXPORT_SYMBOL(skb_set_owner_w);
> > >
> > > +static bool can_skb_orphan_partial(const struct sk_buff *skb)
> > > +{
> > > +#ifdef CONFIG_TLS_DEVICE
> > > +       /* Drivers depend on in-order delivery for crypto offload,
> > > +        * partial orphan breaks out-of-order-OK logic.
> > > +        */
> > > +       if (skb->decrypted)
> > > +               return false;
> > > +#endif
> > > +#ifdef CONFIG_INET
> > > +       if (skb->destructor == tcp_wfree)
> > > +               return true;
> > > +#endif
> > > +       return skb->destructor == sock_wfree;
> > > +}
> > > +
> >
> > Just insert the skb->decrypted check into skb_orphan_partial for less
> > code churn?
>
> Okie.. skb_orphan_partial() is a little ugly but will do.
>
> > I also think that this is an independent concern from leaking plain
> > text, so perhaps could be a separate patch.

Just a suggestion and very much depending on how much uglier it
becomes otherwise ;)

> Do you mean the out-of-order stuff is a separate concern?
>
> It is, I had them separate at the first try, but GSO code looks at
> the destructor and IIRC only copies the socket if its still tcp_wfree.
> If we let partial orphan be we have to do temporary hairy stuff in
> tcp_gso_segment(). It's easier to just deal with partial orphan here.

Okay, sounds good.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] net/tls: prevent skb_orphan() from leaking TLS plain text with offload
  2019-07-30 21:12 [PATCH net-next] net/tls: prevent skb_orphan() from leaking TLS plain text with offload Jakub Kicinski
  2019-07-31 13:57 ` Boris Pismenny
  2019-07-31 15:57 ` Willem de Bruijn
@ 2019-08-03  0:24 ` David Miller
  2019-08-04  1:58   ` Jakub Kicinski
  2 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2019-08-03  0:24 UTC (permalink / raw)
  To: jakub.kicinski
  Cc: netdev, oss-drivers, edumazet, borisp, aviadye, davejwatson,
	john.fastabend, daniel, willemb, xiyou.wangcong, fw,
	alexei.starovoitov

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Tue, 30 Jul 2019 14:12:58 -0700

> I'm sending this for net-next because of lack of confidence
> in my own abilities. It should apply cleanly to net... :)

It looks like there will be changes to this.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] net/tls: prevent skb_orphan() from leaking TLS plain text with offload
  2019-08-03  0:24 ` David Miller
@ 2019-08-04  1:58   ` Jakub Kicinski
  0 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2019-08-04  1:58 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, oss-drivers, edumazet, borisp, aviadye, davejwatson,
	john.fastabend, daniel, willemb, xiyou.wangcong, fw,
	alexei.starovoitov

On Fri, 02 Aug 2019 17:24:53 -0700 (PDT), David Miller wrote:
> From: Jakub Kicinski <jakub.kicinski@netronome.com>
> Date: Tue, 30 Jul 2019 14:12:58 -0700
> 
> > I'm sending this for net-next because of lack of confidence
> > in my own abilities. It should apply cleanly to net... :)  
> 
> It looks like there will be changes to this.

Yes, sorry for going silent, I reworked this to follow Boris'es
suggestion of using flags, and wanted a little bit of extra QA.
Unfortunately 5.3-rc has broken Intel IOMMU and QA lab machines
don't boot.. :(

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-08-04  1:58 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-30 21:12 [PATCH net-next] net/tls: prevent skb_orphan() from leaking TLS plain text with offload Jakub Kicinski
2019-07-31 13:57 ` Boris Pismenny
2019-07-31 18:43   ` Jakub Kicinski
2019-07-31 15:57 ` Willem de Bruijn
2019-07-31 18:12   ` [oss-drivers] " Jakub Kicinski
2019-07-31 19:07     ` Willem de Bruijn
2019-08-03  0:24 ` David Miller
2019-08-04  1:58   ` Jakub Kicinski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.