* [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure
@ 2017-12-18 11:10 Ilya Lesokhin
  2017-12-18 11:10 ` [PATCH v3 net-next 1/6] tcp: Add clean acked data hook Ilya Lesokhin
                   ` (6 more replies)
  0 siblings, 7 replies; 35+ messages in thread
From: Ilya Lesokhin @ 2017-12-18 11:10 UTC (permalink / raw)
  To: netdev, davem
  Cc: davejwatson, tom, hannes, borisp, aviadye, liranl, Ilya Lesokhin

Changes from v2:
- Fix sk use-after-free and possible netdev use-after-free
- tls device now keeps a reference on the offloading netdev
- tls device registers with the netdev notifier.
  Upon a NETDEV_DOWN event, offload is stopped and
  the reference on the netdev is dropped.
- SW fallback support for skb->ip_summed != CHECKSUM_PARTIAL 
- Merged TLS patches are no longer part of this series.

Changes from v1:
- Remove the binding of the socket to a specific netdev 
  through sk->sk_bound_dev_if.
  Add a check in validate_xmit_skb to detect route changes
  and call SW fallback code to do the crypto in software.
- tls_get_record now returns the tls record sequence number.
  This is required to support connections with rcd_sn != iv.
- Bug fixes to the TLS code.

This patchset adds a generic infrastructure to offload TLS crypto to
network devices.

Patches 1-2 export functions that we need.
Patch 3 adds infrastructure for offloaded socket fallback.
Patches 4-5 add new NDOs and capabilities.
Patch 6 adds the TLS NIC offload infrastructure.

Github with mlx5e TLS offload support:
https://github.com/Mellanox/tls-offload/tree/tls_device_v3

Paper: https://www.netdevconf.org/1.2/papers/netdevconf-TLS.pdf

Ilya Lesokhin (6):
  tcp: Add clean acked data hook
  net: Rename and export copy_skb_header
  net: Add SW fallback infrastructure for offloaded sockets
  net: Add TLS offload netdev ops
  net: Add TLS TX offload features
  tls: Add generic NIC offload infrastructure.

 include/linux/netdev_features.h    |   2 +
 include/linux/netdevice.h          |  23 ++
 include/linux/skbuff.h             |   1 +
 include/net/inet_connection_sock.h |   2 +
 include/net/sock.h                 |  17 +
 include/net/tls.h                  |  62 ++-
 net/core/dev.c                     |   4 +
 net/core/ethtool.c                 |   1 +
 net/core/skbuff.c                  |   9 +-
 net/ipv4/tcp_input.c               |   3 +
 net/tls/Kconfig                    |   9 +
 net/tls/Makefile                   |   3 +
 net/tls/tls_device.c               | 800 +++++++++++++++++++++++++++++++++++++
 net/tls/tls_device_fallback.c      | 405 +++++++++++++++++++
 net/tls/tls_main.c                 |  33 +-
 15 files changed, 1363 insertions(+), 11 deletions(-)
 create mode 100644 net/tls/tls_device.c
 create mode 100644 net/tls/tls_device_fallback.c

-- 
2.15.0.317.g14c63a9


* [PATCH v3 net-next 1/6] tcp: Add clean acked data hook
  2017-12-18 11:10 [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure Ilya Lesokhin
@ 2017-12-18 11:10 ` Ilya Lesokhin
  2017-12-19 19:13   ` Eric Dumazet
  2017-12-18 11:10 ` [PATCH v3 net-next 2/6] net: Rename and export copy_skb_header Ilya Lesokhin
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 35+ messages in thread
From: Ilya Lesokhin @ 2017-12-18 11:10 UTC (permalink / raw)
  To: netdev, davem
  Cc: davejwatson, tom, hannes, borisp, aviadye, liranl, Ilya Lesokhin

Add a hook that is called when a TCP segment is acknowledged.
It can be used by application protocols that hold additional
metadata associated with the stream data.
This is required by TLS device offload to release
metadata associated with acknowledged TLS records.
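
For context, here is a minimal sketch of how a ULP might use the new
hook (my_ulp_clean_acked() and my_ulp_attach() are illustrative names,
not part of this series; patch 6 installs tls_icsk_clean_acked here):

	#include <net/inet_connection_sock.h>

	/* Illustrative only: free per-record metadata whose last byte
	 * is now acknowledged, i.e. below tcp_sk(sk)->snd_una.
	 */
	static void my_ulp_clean_acked(struct sock *sk)
	{
		/* walk the ULP's record list and free fully acked entries */
	}

	static void my_ulp_attach(struct sock *sk)
	{
		/* install the hook while holding the socket lock */
		inet_csk(sk)->icsk_clean_acked = &my_ulp_clean_acked;
	}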

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
---
 include/net/inet_connection_sock.h | 2 ++
 net/ipv4/tcp_input.c               | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index 8e1bf9ae4a5e..ec405a667a85 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -77,6 +77,7 @@ struct inet_connection_sock_af_ops {
  * @icsk_af_ops		   Operations which are AF_INET{4,6} specific
  * @icsk_ulp_ops	   Pluggable ULP control hook
  * @icsk_ulp_data	   ULP private data
+ * @icsk_clean_acked	   Clean acked data hook
  * @icsk_listen_portaddr_node	hash to the portaddr listener hashtable
  * @icsk_ca_state:	   Congestion control state
  * @icsk_retransmits:	   Number of unrecovered [RTO] timeouts
@@ -102,6 +103,7 @@ struct inet_connection_sock {
 	const struct inet_connection_sock_af_ops *icsk_af_ops;
 	const struct tcp_ulp_ops  *icsk_ulp_ops;
 	void			  *icsk_ulp_data;
+	void			  (*icsk_clean_acked)(struct sock *sk);
 	struct hlist_node         icsk_listen_portaddr_node;
 	unsigned int		  (*icsk_sync_mss)(struct sock *sk, u32 pmtu);
 	__u8			  icsk_ca_state:6,
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4d55c4b338ee..961abc5be84c 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3592,6 +3592,9 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
 	if (!prior_packets)
 		goto no_queue;
 
+	if (icsk->icsk_clean_acked)
+		icsk->icsk_clean_acked(sk);
+
 	/* See if we can take anything off of the retransmit queue. */
 	flag |= tcp_clean_rtx_queue(sk, prior_fack, prior_snd_una, &sack_state);
 
-- 
2.15.0.317.g14c63a9


* [PATCH v3 net-next 2/6] net: Rename and export copy_skb_header
  2017-12-18 11:10 [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure Ilya Lesokhin
  2017-12-18 11:10 ` [PATCH v3 net-next 1/6] tcp: Add clean acked data hook Ilya Lesokhin
@ 2017-12-18 11:10 ` Ilya Lesokhin
  2017-12-18 11:10 ` [PATCH v3 net-next 3/6] net: Add SW fallback infrastructure for offloaded sockets Ilya Lesokhin
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 35+ messages in thread
From: Ilya Lesokhin @ 2017-12-18 11:10 UTC (permalink / raw)
  To: netdev, davem
  Cc: davejwatson, tom, hannes, borisp, aviadye, liranl, Ilya Lesokhin

copy_skb_header is renamed to skb_copy_header and
exported. Exposing this function gives more flexibility
in copying SKBs.
skb_copy and skb_copy_expand do not give enough control
over which parts are copied.
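
A minimal sketch of the intended use, mirroring what the SW fallback in
patch 6 of this series does (my_clone_headers() is an illustrative
name):

	static struct sk_buff *my_clone_headers(const struct sk_buff *skb)
	{
		struct sk_buff *nskb;

		/* build a fresh skb carrying the original's header
		 * metadata (gso fields, marks, timestamps) while the
		 * payload is filled in separately, which skb_copy() and
		 * skb_copy_expand() cannot express
		 */
		nskb = alloc_skb(skb_headroom(skb) + skb->len, GFP_ATOMIC);
		if (!nskb)
			return NULL;

		skb_reserve(nskb, skb_headroom(skb));
		skb_copy_header(nskb, skb);
		/* the caller copies or transforms the payload into nskb */
		return nskb;
	}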

Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
---
 include/linux/skbuff.h | 1 +
 net/core/skbuff.c      | 9 +++++----
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index b8e0da6c27d6..1857ec77e89b 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1028,6 +1028,7 @@ static inline struct sk_buff *alloc_skb_fclone(unsigned int size,
 struct sk_buff *skb_morph(struct sk_buff *dst, struct sk_buff *src);
 int skb_copy_ubufs(struct sk_buff *skb, gfp_t gfp_mask);
 struct sk_buff *skb_clone(struct sk_buff *skb, gfp_t priority);
+void skb_copy_header(struct sk_buff *new, const struct sk_buff *old);
 struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t priority);
 struct sk_buff *__pskb_copy_fclone(struct sk_buff *skb, int headroom,
 				   gfp_t gfp_mask, bool fclone);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index a592ca025fc4..c318cdf67a83 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1301,7 +1301,7 @@ static void skb_headers_offset_update(struct sk_buff *skb, int off)
 	skb->inner_mac_header += off;
 }
 
-static void copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
+void skb_copy_header(struct sk_buff *new, const struct sk_buff *old)
 {
 	__copy_skb_header(new, old);
 
@@ -1309,6 +1309,7 @@ static void copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
 	skb_shinfo(new)->gso_segs = skb_shinfo(old)->gso_segs;
 	skb_shinfo(new)->gso_type = skb_shinfo(old)->gso_type;
 }
+EXPORT_SYMBOL(skb_copy_header);
 
 static inline int skb_alloc_rx_flag(const struct sk_buff *skb)
 {
@@ -1351,7 +1352,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t gfp_mask)
 
 	BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len));
 
-	copy_skb_header(n, skb);
+	skb_copy_header(n, skb);
 	return n;
 }
 EXPORT_SYMBOL(skb_copy);
@@ -1415,7 +1416,7 @@ struct sk_buff *__pskb_copy_fclone(struct sk_buff *skb, int headroom,
 		skb_clone_fraglist(n);
 	}
 
-	copy_skb_header(n, skb);
+	skb_copy_header(n, skb);
 out:
 	return n;
 }
@@ -1595,7 +1596,7 @@ struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
 	BUG_ON(skb_copy_bits(skb, -head_copy_len, n->head + head_copy_off,
 			     skb->len + head_copy_len));
 
-	copy_skb_header(n, skb);
+	skb_copy_header(n, skb);
 
 	skb_headers_offset_update(n, newheadroom - oldheadroom);
 
-- 
2.15.0.317.g14c63a9


* [PATCH v3 net-next 3/6] net: Add SW fallback infrastructure for offloaded sockets
  2017-12-18 11:10 [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure Ilya Lesokhin
  2017-12-18 11:10 ` [PATCH v3 net-next 1/6] tcp: Add clean acked data hook Ilya Lesokhin
  2017-12-18 11:10 ` [PATCH v3 net-next 2/6] net: Rename and export copy_skb_header Ilya Lesokhin
@ 2017-12-18 11:10 ` Ilya Lesokhin
  2017-12-18 19:18   ` Marcelo Ricardo Leitner
  2017-12-19 19:12   ` Eric Dumazet
  2017-12-18 11:10 ` [PATCH v3 net-next 4/6] net: Add TLS offload netdev ops Ilya Lesokhin
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 35+ messages in thread
From: Ilya Lesokhin @ 2017-12-18 11:10 UTC (permalink / raw)
  To: netdev, davem
  Cc: davejwatson, tom, hannes, borisp, aviadye, liranl, Ilya Lesokhin

Offloaded sockets rely on the netdev to transform the transmitted
packets before sending them over the network.
When a packet from an offloaded socket is looped back or
rerouted to a different device, we need to detect this and
do the transformation in software.
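
A minimal sketch of how an offloaded protocol is expected to use the
new hook (my_offload_netdev() and my_sw_fallback() are illustrative;
patch 6 installs tls_validate_xmit here):

	static struct sk_buff *my_offload_check(struct sock *sk,
						struct net_device *dev,
						struct sk_buff *skb)
	{
		/* still egressing through the device that holds the
		 * offloaded state: nothing to do
		 */
		if (dev == my_offload_netdev(sk))
			return skb;

		/* looped back or rerouted: do the transformation in SW,
		 * possibly returning a different skb, or NULL on failure
		 */
		return my_sw_fallback(sk, skb);
	}

	/* installed when the offloaded state is set up */
	sk->sk_offload_check = my_offload_check;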

Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
---
 include/net/sock.h | 17 +++++++++++++++++
 net/core/dev.c     |  4 ++++
 2 files changed, 21 insertions(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index 9a9047268d37..5397307603ec 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -479,6 +479,9 @@ struct sock {
 	void			(*sk_error_report)(struct sock *sk);
 	int			(*sk_backlog_rcv)(struct sock *sk,
 						  struct sk_buff *skb);
+	struct sk_buff*		(*sk_offload_check)(struct sock *sk,
+						    struct net_device *dev,
+						    struct sk_buff *skb);
 	void                    (*sk_destruct)(struct sock *sk);
 	struct sock_reuseport __rcu	*sk_reuseport_cb;
 	struct rcu_head		sk_rcu;
@@ -2324,6 +2327,20 @@ static inline bool sk_fullsock(const struct sock *sk)
 	return (1 << sk->sk_state) & ~(TCPF_TIME_WAIT | TCPF_NEW_SYN_RECV);
 }
 
+/* Checks if this SKB belongs to an HW offloaded socket
+ * and whether any SW fallbacks are required based on dev.
+ */
+static inline struct sk_buff *skb_offload_check(struct sk_buff *skb,
+						struct net_device *dev)
+{
+	struct sock *sk = skb->sk;
+
+	if (sk && sk_fullsock(sk) && sk->sk_offload_check)
+		skb = sk->sk_offload_check(sk, dev, skb);
+
+	return skb;
+}
+
 /* This helper checks if a socket is a LISTEN or NEW_SYN_RECV
  * SYNACK messages can be attached to either ones (depending on SYNCOOKIE)
  */
diff --git a/net/core/dev.c b/net/core/dev.c
index b0eee49a2489..6a78d9046674 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3051,6 +3051,10 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device
 	if (unlikely(!skb))
 		goto out_null;
 
+	skb = skb_offload_check(skb, dev);
+	if (!skb)
+		goto out_null;
+
 	if (netif_needs_gso(skb, features)) {
 		struct sk_buff *segs;
 
-- 
2.15.0.317.g14c63a9


* [PATCH v3 net-next 4/6] net: Add TLS offload netdev ops
  2017-12-18 11:10 [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure Ilya Lesokhin
                   ` (2 preceding siblings ...)
  2017-12-18 11:10 ` [PATCH v3 net-next 3/6] net: Add SW fallback infrastructure for offloaded sockets Ilya Lesokhin
@ 2017-12-18 11:10 ` Ilya Lesokhin
  2017-12-18 11:10 ` [PATCH v3 net-next 5/6] net: Add TLS TX offload features Ilya Lesokhin
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 35+ messages in thread
From: Ilya Lesokhin @ 2017-12-18 11:10 UTC (permalink / raw)
  To: netdev, davem
  Cc: davejwatson, tom, hannes, borisp, aviadye, liranl, Ilya Lesokhin

Add new netdev ops to add and delete a TLS context.
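
A minimal sketch of how a driver would implement and expose the new
ops (the mydrv_* names are illustrative, not part of this series):

	static int mydrv_tls_dev_add(struct net_device *netdev,
				     struct sock *sk,
				     enum tls_offload_ctx_dir direction,
				     struct tls_crypto_info *crypto_info)
	{
		/* program the per-connection TX crypto state in HW;
		 * returning an error makes the stack fall back to SW crypto
		 */
		return 0;
	}

	static void mydrv_tls_dev_del(struct net_device *netdev,
				      struct tls_context *ctx,
				      enum tls_offload_ctx_dir direction)
	{
		/* release the HW state associated with ctx */
	}

	static const struct tlsdev_ops mydrv_tlsdev_ops = {
		.tls_dev_add	= mydrv_tls_dev_add,
		.tls_dev_del	= mydrv_tls_dev_del,
	};

	/* at probe time */
	netdev->tlsdev_ops = &mydrv_tlsdev_ops;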

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
---
 include/linux/netdevice.h | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index cc4ce7456e38..48f26fe67b82 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -845,6 +845,25 @@ struct xfrmdev_ops {
 };
 #endif
 
+#if IS_ENABLED(CONFIG_TLS_DEVICE)
+enum tls_offload_ctx_dir {
+	TLS_OFFLOAD_CTX_DIR_RX,
+	TLS_OFFLOAD_CTX_DIR_TX,
+};
+
+struct tls_crypto_info;
+struct tls_context;
+
+struct tlsdev_ops {
+	int (*tls_dev_add)(struct net_device *netdev, struct sock *sk,
+			   enum tls_offload_ctx_dir direction,
+			   struct tls_crypto_info *crypto_info);
+	void (*tls_dev_del)(struct net_device *netdev,
+			    struct tls_context *ctx,
+			    enum tls_offload_ctx_dir direction);
+};
+#endif
+
 struct dev_ifalias {
 	struct rcu_head rcuhead;
 	char ifalias[];
@@ -1730,6 +1749,10 @@ struct net_device {
 	const struct xfrmdev_ops *xfrmdev_ops;
 #endif
 
+#if IS_ENABLED(CONFIG_TLS_DEVICE)
+	const struct tlsdev_ops *tlsdev_ops;
+#endif
+
 	const struct header_ops *header_ops;
 
 	unsigned int		flags;
-- 
2.15.0.317.g14c63a9


* [PATCH v3 net-next 5/6] net: Add TLS TX offload features
  2017-12-18 11:10 [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure Ilya Lesokhin
                   ` (3 preceding siblings ...)
  2017-12-18 11:10 ` [PATCH v3 net-next 4/6] net: Add TLS offload netdev ops Ilya Lesokhin
@ 2017-12-18 11:10 ` Ilya Lesokhin
  2017-12-18 11:10 ` [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure Ilya Lesokhin
  2017-12-18 17:10 ` [PATCH v3 net-next 0/6] " Jiri Pirko
  6 siblings, 0 replies; 35+ messages in thread
From: Ilya Lesokhin @ 2017-12-18 11:10 UTC (permalink / raw)
  To: netdev, davem
  Cc: davejwatson, tom, hannes, borisp, aviadye, liranl, Ilya Lesokhin

This patch adds a netdev feature to configure TLS TX offloads.
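
A driver that implements the tlsdev_ops added in the previous patch is
expected to advertise the capability like any other feature bit
(sketch):

	netdev->features    |= NETIF_F_HW_TLS_TX;
	netdev->hw_features |= NETIF_F_HW_TLS_TX;

With the ethtool string added below, the offload can then be toggled
from userspace, e.g. "ethtool -K <dev> tls-hw-tx-offload off".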

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
---
 include/linux/netdev_features.h | 2 ++
 net/core/ethtool.c              | 1 +
 2 files changed, 3 insertions(+)

diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index b1b0ca7ccb2b..b219776a1340 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -77,6 +77,7 @@ enum {
 	NETIF_F_HW_ESP_BIT,		/* Hardware ESP transformation offload */
 	NETIF_F_HW_ESP_TX_CSUM_BIT,	/* ESP with TX checksum offload */
 	NETIF_F_RX_UDP_TUNNEL_PORT_BIT, /* Offload of RX port for UDP tunnels */
+	NETIF_F_HW_TLS_TX_BIT,		/* Hardware TLS TX offload */
 
 	/*
 	 * Add your fresh new feature above and remember to update
@@ -142,6 +143,7 @@ enum {
 #define NETIF_F_HW_ESP		__NETIF_F(HW_ESP)
 #define NETIF_F_HW_ESP_TX_CSUM	__NETIF_F(HW_ESP_TX_CSUM)
 #define	NETIF_F_RX_UDP_TUNNEL_PORT  __NETIF_F(RX_UDP_TUNNEL_PORT)
+#define NETIF_F_HW_TLS_TX	__NETIF_F(HW_TLS_TX)
 
 #define for_each_netdev_feature(mask_addr, bit)	\
 	for_each_set_bit(bit, (unsigned long *)mask_addr, NETDEV_FEATURE_COUNT)
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index f8fcf450a36e..a1138d2dafa6 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -106,6 +106,7 @@ static const char netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN]
 	[NETIF_F_HW_ESP_BIT] =		 "esp-hw-offload",
 	[NETIF_F_HW_ESP_TX_CSUM_BIT] =	 "esp-tx-csum-hw-offload",
 	[NETIF_F_RX_UDP_TUNNEL_PORT_BIT] =	 "rx-udp_tunnel-port-offload",
+	[NETIF_F_HW_TLS_TX_BIT] =	 "tls-hw-tx-offload",
 };
 
 static const char
-- 
2.15.0.317.g14c63a9


* [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.
  2017-12-18 11:10 [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure Ilya Lesokhin
                   ` (4 preceding siblings ...)
  2017-12-18 11:10 ` [PATCH v3 net-next 5/6] net: Add TLS TX offload features Ilya Lesokhin
@ 2017-12-18 11:10 ` Ilya Lesokhin
  2017-12-18 19:53   ` Marcelo Ricardo Leitner
                     ` (4 more replies)
  2017-12-18 17:10 ` [PATCH v3 net-next 0/6] " Jiri Pirko
  6 siblings, 5 replies; 35+ messages in thread
From: Ilya Lesokhin @ 2017-12-18 11:10 UTC (permalink / raw)
  To: netdev, davem
  Cc: davejwatson, tom, hannes, borisp, aviadye, liranl, Ilya Lesokhin

This patch adds a generic infrastructure to offload TLS crypto to
network devices. It enables the kernel TLS socket to skip encryption
and authentication operations on the transmit side of the data path,
leaving those computationally expensive operations to the NIC.

The NIC offload infrastructure builds TLS records and pushes them to
the TCP layer just like the SW KTLS implementation, using the same API.
TCP segmentation is mostly unaffected. Currently the only exception is
that we prevent mixed SKBs where only part of the payload requires
offload. In the future we are likely to add a similar restriction
following a change cipher spec record.

The notable differences between the SW KTLS and NIC offloaded TLS
implementations are as follows:
1. The offloaded implementation builds "plaintext TLS records": these
records contain plaintext instead of ciphertext and placeholder bytes
instead of authentication tags.
2. The offloaded implementation maintains a mapping from TCP sequence
number to TLS records. Thus, given a TCP SKB sent from a NIC offloaded
TLS socket, we can use the tls NIC offload infrastructure to obtain
enough context to encrypt the payload of the SKB (see the sketch
below).
A TLS record is released when the last byte of the record is acked;
this is done through the new icsk_clean_acked callback.
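
A minimal sketch of the lookup described in point 2, using the helpers
added in this patch (error handling elided; tls_device_fallback.c below
does the same under ctx->lock):

	struct tls_offload_context *ctx = tls_offload_ctx(tls_get_ctx(sk));
	u32 tcp_seq = ntohl(tcp_hdr(skb)->seq);
	struct tls_record_info *record;
	unsigned long flags;
	u64 rcd_sn;

	spin_lock_irqsave(&ctx->lock, flags);
	record = tls_get_record(ctx, tcp_seq, &rcd_sn);
	if (record) {
		/* record->frags[] holds the full plaintext record that
		 * covers tcp_seq; rcd_sn is the TLS record sequence number
		 * needed to rebuild the AEAD nonce and AAD. Take references
		 * on the frags before dropping the lock.
		 */
	}
	spin_unlock_irqrestore(&ctx->lock, flags);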

The infrastructure should be extensible enough to support various NIC
offload implementations. However, it is currently written with the
following implementation in mind:
The NIC assumes that packets from each offloaded stream are sent as
plaintext and in order. It keeps track of the TLS records in the TCP
stream. When a packet marked for offload is transmitted, the NIC
encrypts the payload in-place and puts authentication tags in the
relevant placeholders.

The responsibility for handling out-of-order packets (e.g. due to TCP
retransmission or qdisc drops) falls on the netdev driver.

The netdev driver keeps track of the expected TCP SN from the NIC's
perspective.  If the next packet to transmit matches the expected TCP
SN, the driver advances the expected TCP SN, and transmits the packet
with TLS offload indication.

If the next packet to transmit does not match the expected TCP SN, the
driver calls the TLS layer to obtain the TLS record that includes the
TCP sequence number of the packet to be transmitted. Using this TLS
record, the driver posts a work entry on the transmit queue to
reconstruct the NIC TLS state required for the offload of the
out-of-order packet. It then updates the expected TCP SN accordingly
and transmits the now in-order packet.
The same queue is used for packet transmission and TLS context
reconstruction to avoid the need for flushing the transmit queue before
issuing the context reconstruction request.
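
A rough driver-side sketch of the bookkeeping described above (priv,
payload_len and the mydrv_* helpers are illustrative, not an API
introduced by this series):

	u32 tcp_seq = ntohl(tcp_hdr(skb)->seq);

	if (unlikely(tcp_seq != priv->expected_tcp_sn)) {
		struct tls_record_info *rec;
		u64 rcd_sn;

		/* ask the TLS layer (under the offload context lock) which
		 * record covers this sequence number, then post a resync
		 * work entry on the same TX queue to rebuild the NIC
		 * crypto state ahead of the packet
		 */
		rec = tls_get_record(offload_ctx, tcp_seq, &rcd_sn);
		mydrv_post_tls_resync_wqe(txq, rec, rcd_sn, tcp_seq);
	}

	priv->expected_tcp_sn = tcp_seq + payload_len;
	mydrv_xmit_with_tls_offload(txq, skb);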

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
---
 include/net/tls.h             |  62 +++-
 net/tls/Kconfig               |   9 +
 net/tls/Makefile              |   3 +
 net/tls/tls_device.c          | 800 ++++++++++++++++++++++++++++++++++++++++++
 net/tls/tls_device_fallback.c | 405 +++++++++++++++++++++
 net/tls/tls_main.c            |  33 +-
 6 files changed, 1305 insertions(+), 7 deletions(-)
 create mode 100644 net/tls/tls_device.c
 create mode 100644 net/tls/tls_device_fallback.c

diff --git a/include/net/tls.h b/include/net/tls.h
index 936cfc5cab7d..9c1b5d13d9a7 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -75,6 +75,29 @@ struct tls_sw_context {
 	struct scatterlist sg_aead_out[2];
 };
 
+struct tls_record_info {
+	struct list_head list;
+	u32 end_seq;
+	int len;
+	int num_frags;
+	skb_frag_t frags[MAX_SKB_FRAGS];
+};
+
+struct tls_offload_context {
+	struct crypto_aead *aead_send;
+
+	struct list_head records_list;
+	struct scatterlist sg_tx_data[MAX_SKB_FRAGS];
+	void (*sk_destruct)(struct sock *sk);
+	struct tls_record_info *open_record;
+	struct tls_record_info *retransmit_hint;
+	u64 hint_record_sn;
+	u64 unacked_record_sn;
+
+	u32 expected_seq;
+	spinlock_t lock;	/* protects records list */
+};
+
 enum {
 	TLS_PENDING_CLOSED_RECORD
 };
@@ -85,6 +108,10 @@ struct tls_context {
 		struct tls12_crypto_info_aes_gcm_128 crypto_send_aes_gcm_128;
 	};
 
+	struct list_head list;
+	struct net_device *netdev;
+	refcount_t refcount;
+
 	void *priv_ctx;
 
 	u8 tx_conf:2;
@@ -129,9 +156,29 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
 void tls_sw_close(struct sock *sk, long timeout);
 void tls_sw_free_tx_resources(struct sock *sk);
 
-void tls_sk_destruct(struct sock *sk, struct tls_context *ctx);
-void tls_icsk_clean_acked(struct sock *sk);
+void tls_clear_device_offload(struct sock *sk, struct tls_context *ctx);
+int tls_set_device_offload(struct sock *sk, struct tls_context *ctx);
+int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int tls_device_sendpage(struct sock *sk, struct page *page,
+			int offset, size_t size, int flags);
+void tls_device_sk_destruct(struct sock *sk);
+void tls_device_init(void);
+void tls_device_cleanup(void);
 
+struct tls_record_info *tls_get_record(struct tls_offload_context *context,
+				       u32 seq, u64 *p_record_sn);
+
+static inline bool tls_record_is_start_marker(struct tls_record_info *rec)
+{
+	return rec->len == 0;
+}
+
+static inline u32 tls_record_start_seq(struct tls_record_info *rec)
+{
+	return rec->end_seq - rec->len;
+}
+
+void tls_sk_destruct(struct sock *sk, struct tls_context *ctx);
 int tls_push_sg(struct sock *sk, struct tls_context *ctx,
 		struct scatterlist *sg, u16 first_offset,
 		int flags);
@@ -168,6 +215,13 @@ static inline bool tls_is_pending_open_record(struct tls_context *tls_ctx)
 	return tls_ctx->pending_open_record_frags;
 }
 
+static inline bool tls_is_sk_tx_device_offloaded(struct sock *sk)
+{
+	return sk_fullsock(sk) &&
+	       /* matches smp_store_release in tls_set_device_offload */
+	       smp_load_acquire(&sk->sk_destruct) == &tls_device_sk_destruct;
+}
+
 static inline void tls_err_abort(struct sock *sk)
 {
 	sk->sk_err = -EBADMSG;
@@ -255,4 +309,8 @@ static inline struct tls_offload_context *tls_offload_ctx(
 int tls_proccess_cmsg(struct sock *sk, struct msghdr *msg,
 		      unsigned char *record_type);
 
+int tls_sw_fallback_init(struct sock *sk,
+			 struct tls_offload_context *offload_ctx,
+			 struct tls_crypto_info *crypto_info);
+
 #endif /* _TLS_OFFLOAD_H */
diff --git a/net/tls/Kconfig b/net/tls/Kconfig
index eb583038c67e..1a4ea55c2f09 100644
--- a/net/tls/Kconfig
+++ b/net/tls/Kconfig
@@ -13,3 +13,12 @@ config TLS
 	encryption handling of the TLS protocol to be done in-kernel.
 
 	If unsure, say N.
+
+config TLS_DEVICE
+	bool "Transport Layer Security HW offload"
+	depends on TLS
+	default n
+	---help---
+	Enable kernel support for HW offload of the TLS protocol.
+
+	If unsure, say N.
diff --git a/net/tls/Makefile b/net/tls/Makefile
index a930fd1c4f7b..44483cd47b3a 100644
--- a/net/tls/Makefile
+++ b/net/tls/Makefile
@@ -5,3 +5,6 @@
 obj-$(CONFIG_TLS) += tls.o
 
 tls-y := tls_main.o tls_sw.o
+
+tls-$(CONFIG_TLS_DEVICE) += tls_device.o tls_device_fallback.o
+
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
new file mode 100644
index 000000000000..5082d693a503
--- /dev/null
+++ b/net/tls/tls_device.c
@@ -0,0 +1,800 @@
+/* Copyright (c) 2016-2017, Mellanox Technologies All rights reserved.
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ *      - Neither the name of the Mellanox Technologies nor the
+ *        names of its contributors may be used to endorse or promote
+ *        products derived from this software without specific prior written
+ *        permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
+ * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+ * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE
+ */
+
+#include <linux/module.h>
+#include <net/tcp.h>
+#include <net/inet_common.h>
+#include <linux/highmem.h>
+#include <linux/netdevice.h>
+
+#include <net/tls.h>
+#include <crypto/aead.h>
+
+/* device_offload_lock is used to synchronize tls_dev_add
+ * against NETDEV_DOWN notifications.
+ */
+DEFINE_STATIC_PERCPU_RWSEM(device_offload_lock);
+
+static void tls_device_gc_task(struct work_struct *work);
+
+static DECLARE_WORK(tls_device_gc_work, tls_device_gc_task);
+static LIST_HEAD(tls_device_gc_list);
+static LIST_HEAD(tls_device_list);
+static DEFINE_SPINLOCK(tls_device_lock);
+
+static void tls_device_free_ctx(struct tls_context *ctx)
+{
+	struct tls_offload_context *offload_ctx = tls_offload_ctx(ctx);
+
+	kfree(offload_ctx);
+	kfree(ctx);
+}
+
+static void tls_device_gc_task(struct work_struct *work)
+{
+	struct tls_context *ctx, *tmp;
+	struct list_head gc_list;
+	unsigned long flags;
+
+	spin_lock_irqsave(&tls_device_lock, flags);
+	INIT_LIST_HEAD(&gc_list);
+	list_splice_init(&tls_device_gc_list, &gc_list);
+	spin_unlock_irqrestore(&tls_device_lock, flags);
+
+	list_for_each_entry_safe(ctx, tmp, &gc_list, list) {
+		struct net_device *netdev = ctx->netdev;
+
+		if (netdev) {
+			netdev->tlsdev_ops->tls_dev_del(netdev, ctx,
+							TLS_OFFLOAD_CTX_DIR_TX);
+			dev_put(netdev);
+		}
+
+		list_del(&ctx->list);
+		tls_device_free_ctx(ctx);
+	}
+}
+
+static void tls_device_queue_ctx_destruction(struct tls_context *ctx)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&tls_device_lock, flags);
+	list_move_tail(&ctx->list, &tls_device_gc_list);
+
+	/* schedule_work inside the spinlock
+	 * to make sure tls_device_down waits for that work.
+	 */
+	schedule_work(&tls_device_gc_work);
+
+	spin_unlock_irqrestore(&tls_device_lock, flags);
+}
+
+/* We assume that the socket is already connected */
+static struct net_device *get_netdev_for_sock(struct sock *sk)
+{
+	struct inet_sock *inet = inet_sk(sk);
+	struct net_device *netdev = NULL;
+
+	netdev = dev_get_by_index(sock_net(sk), inet->cork.fl.flowi_oif);
+
+	return netdev;
+}
+
+static int attach_sock_to_netdev(struct sock *sk, struct net_device *netdev,
+				 struct tls_context *ctx)
+{
+	int rc;
+
+	rc = netdev->tlsdev_ops->tls_dev_add(netdev, sk, TLS_OFFLOAD_CTX_DIR_TX,
+					     &ctx->crypto_send);
+	if (rc) {
+		pr_err("The netdev has refused to offload this socket\n");
+		goto out;
+	}
+
+	rc = 0;
+out:
+	return rc;
+}
+
+static void destroy_record(struct tls_record_info *record)
+{
+	skb_frag_t *frag;
+	int nr_frags = record->num_frags;
+
+	while (nr_frags > 0) {
+		frag = &record->frags[nr_frags - 1];
+		__skb_frag_unref(frag);
+		--nr_frags;
+	}
+	kfree(record);
+}
+
+static void delete_all_records(struct tls_offload_context *offload_ctx)
+{
+	struct tls_record_info *info, *temp;
+
+	list_for_each_entry_safe(info, temp, &offload_ctx->records_list, list) {
+		list_del(&info->list);
+		destroy_record(info);
+	}
+
+	offload_ctx->retransmit_hint = NULL;
+}
+
+static void tls_icsk_clean_acked(struct sock *sk)
+{
+	struct tls_context *tls_ctx = tls_get_ctx(sk);
+	struct tls_offload_context *ctx;
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct tls_record_info *info, *temp;
+	unsigned long flags;
+	u64 deleted_records = 0;
+
+	if (!tls_ctx)
+		return;
+
+	ctx = tls_offload_ctx(tls_ctx);
+
+	spin_lock_irqsave(&ctx->lock, flags);
+	info = ctx->retransmit_hint;
+	if (info && !before(tp->snd_una, info->end_seq)) {
+		ctx->retransmit_hint = NULL;
+		list_del(&info->list);
+		destroy_record(info);
+		deleted_records++;
+	}
+
+	list_for_each_entry_safe(info, temp, &ctx->records_list, list) {
+		if (before(tp->snd_una, info->end_seq))
+			break;
+		list_del(&info->list);
+
+		destroy_record(info);
+		deleted_records++;
+	}
+
+	ctx->unacked_record_sn += deleted_records;
+	spin_unlock_irqrestore(&ctx->lock, flags);
+}
+
+/* At this point, there should be no references on this
+ * socket and no in-flight SKBs associated with this
+ * socket, so it is safe to free all the resources.
+ */
+void tls_device_sk_destruct(struct sock *sk)
+{
+	struct tls_context *tls_ctx = tls_get_ctx(sk);
+	struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
+
+	if (ctx->open_record)
+		destroy_record(ctx->open_record);
+
+	delete_all_records(ctx);
+	crypto_free_aead(ctx->aead_send);
+	ctx->sk_destruct(sk);
+
+	if (refcount_dec_and_test(&tls_ctx->refcount))
+		tls_device_queue_ctx_destruction(tls_ctx);
+}
+EXPORT_SYMBOL(tls_device_sk_destruct);
+
+static inline void tls_append_frag(struct tls_record_info *record,
+				   struct page_frag *pfrag,
+				   int size)
+{
+	skb_frag_t *frag;
+
+	frag = &record->frags[record->num_frags - 1];
+	if (frag->page.p == pfrag->page &&
+	    frag->page_offset + frag->size == pfrag->offset) {
+		frag->size += size;
+	} else {
+		++frag;
+		frag->page.p = pfrag->page;
+		frag->page_offset = pfrag->offset;
+		frag->size = size;
+		++record->num_frags;
+		get_page(pfrag->page);
+	}
+
+	pfrag->offset += size;
+	record->len += size;
+}
+
+static inline int tls_push_record(struct sock *sk,
+				  struct tls_context *ctx,
+				  struct tls_offload_context *offload_ctx,
+				  struct tls_record_info *record,
+				  struct page_frag *pfrag,
+				  int flags,
+				  unsigned char record_type)
+{
+	skb_frag_t *frag;
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct page_frag fallback_frag;
+	struct page_frag  *tag_pfrag = pfrag;
+	int i;
+
+	/* fill prepend */
+	frag = &record->frags[0];
+	tls_fill_prepend(ctx,
+			 skb_frag_address(frag),
+			 record->len - ctx->prepend_size,
+			 record_type);
+
+	if (unlikely(!skb_page_frag_refill(ctx->tag_size, pfrag, GFP_KERNEL))) {
+		/* HW doesn't care about the data in the tag
+		 * so in case pfrag has no room
+		 * for a tag and we can't allocate a new pfrag
+		 * just use the page in the first frag
+		 * rather than write complicated fallback code.
+		 */
+		tag_pfrag = &fallback_frag;
+		tag_pfrag->page = skb_frag_page(frag);
+		tag_pfrag->offset = 0;
+	}
+
+	tls_append_frag(record, tag_pfrag, ctx->tag_size);
+	record->end_seq = tp->write_seq + record->len;
+	spin_lock_irq(&offload_ctx->lock);
+	list_add_tail(&record->list, &offload_ctx->records_list);
+	spin_unlock_irq(&offload_ctx->lock);
+	offload_ctx->open_record = NULL;
+	set_bit(TLS_PENDING_CLOSED_RECORD, &ctx->flags);
+	tls_advance_record_sn(sk, ctx);
+
+	for (i = 0; i < record->num_frags; i++) {
+		frag = &record->frags[i];
+		sg_unmark_end(&offload_ctx->sg_tx_data[i]);
+		sg_set_page(&offload_ctx->sg_tx_data[i], skb_frag_page(frag),
+			    frag->size, frag->page_offset);
+		sk_mem_charge(sk, frag->size);
+		get_page(skb_frag_page(frag));
+	}
+	sg_mark_end(&offload_ctx->sg_tx_data[record->num_frags - 1]);
+
+	/* all ready, send */
+	return tls_push_sg(sk, ctx, offload_ctx->sg_tx_data, 0, flags);
+}
+
+static inline int tls_create_new_record(struct tls_offload_context *offload_ctx,
+					struct page_frag *pfrag,
+					size_t prepend_size)
+{
+	skb_frag_t *frag;
+	struct tls_record_info *record;
+
+	record = kmalloc(sizeof(*record), GFP_KERNEL);
+	if (!record)
+		return -ENOMEM;
+
+	frag = &record->frags[0];
+	__skb_frag_set_page(frag, pfrag->page);
+	frag->page_offset = pfrag->offset;
+	skb_frag_size_set(frag, prepend_size);
+
+	get_page(pfrag->page);
+	pfrag->offset += prepend_size;
+
+	record->num_frags = 1;
+	record->len = prepend_size;
+	offload_ctx->open_record = record;
+	return 0;
+}
+
+static inline int tls_do_allocation(struct sock *sk,
+				    struct tls_offload_context *offload_ctx,
+				    struct page_frag *pfrag,
+				    size_t prepend_size)
+{
+	int ret;
+
+	if (!offload_ctx->open_record) {
+		if (unlikely(!skb_page_frag_refill(prepend_size, pfrag,
+						   sk->sk_allocation))) {
+			sk->sk_prot->enter_memory_pressure(sk);
+			sk_stream_moderate_sndbuf(sk);
+			return -ENOMEM;
+		}
+
+		ret = tls_create_new_record(offload_ctx, pfrag, prepend_size);
+		if (ret)
+			return ret;
+
+		if (pfrag->size > pfrag->offset)
+			return 0;
+	}
+
+	if (!sk_page_frag_refill(sk, pfrag))
+		return -ENOMEM;
+
+	return 0;
+}
+
+static int tls_push_data(struct sock *sk,
+			 struct iov_iter *msg_iter,
+			 size_t size, int flags,
+			 unsigned char record_type)
+{
+	struct tls_context *tls_ctx = tls_get_ctx(sk);
+	struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
+	struct tls_record_info *record = ctx->open_record;
+	struct page_frag *pfrag;
+	int copy, rc = 0;
+	size_t orig_size = size;
+	u32 max_open_record_len;
+	long timeo;
+	int more = flags & (MSG_SENDPAGE_NOTLAST | MSG_MORE);
+	int tls_push_record_flags = flags | MSG_SENDPAGE_NOTLAST;
+	bool done = false;
+
+	if (flags &
+	    ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST))
+		return -ENOTSUPP;
+
+	if (sk->sk_err)
+		return -sk->sk_err;
+
+	timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
+	rc = tls_complete_pending_work(sk, tls_ctx, flags, &timeo);
+	if (rc < 0)
+		return rc;
+
+	pfrag = sk_page_frag(sk);
+
+	/* KTLS_TLS_HEADER_SIZE is not counted as part of the TLS record, and
+	 * we need to leave room for an authentication tag.
+	 */
+	max_open_record_len = TLS_MAX_PAYLOAD_SIZE +
+			      tls_ctx->prepend_size;
+	do {
+		if (tls_do_allocation(sk, ctx, pfrag,
+				      tls_ctx->prepend_size)) {
+			rc = sk_stream_wait_memory(sk, &timeo);
+			if (!rc)
+				continue;
+
+			record = ctx->open_record;
+			if (!record)
+				break;
+handle_error:
+			if (record_type != TLS_RECORD_TYPE_DATA) {
+				/* avoid sending partial
+				 * record with type !=
+				 * application_data
+				 */
+				size = orig_size;
+				destroy_record(record);
+				ctx->open_record = NULL;
+			} else if (record->len > tls_ctx->prepend_size) {
+				goto last_record;
+			}
+
+			break;
+		}
+
+		record = ctx->open_record;
+		copy = min_t(size_t, size, (pfrag->size - pfrag->offset));
+		copy = min_t(size_t, copy, (max_open_record_len - record->len));
+
+		if (copy_from_iter_nocache(page_address(pfrag->page) +
+					       pfrag->offset,
+					   copy, msg_iter) != copy) {
+			rc = -EFAULT;
+			goto handle_error;
+		}
+		tls_append_frag(record, pfrag, copy);
+
+		size -= copy;
+		if (!size) {
+last_record:
+			tls_push_record_flags = flags;
+			if (more) {
+				tls_ctx->pending_open_record_frags =
+						record->num_frags;
+				break;
+			}
+
+			done = true;
+		}
+
+		if ((done) || record->len >= max_open_record_len ||
+		    (record->num_frags >= MAX_SKB_FRAGS - 1)) {
+			rc = tls_push_record(sk,
+					     tls_ctx,
+					     ctx,
+					     record,
+					     pfrag,
+					     tls_push_record_flags,
+					     record_type);
+			if (rc < 0)
+				break;
+		}
+	} while (!done);
+
+	if (orig_size - size > 0)
+		rc = orig_size - size;
+
+	return rc;
+}
+
+int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+{
+	unsigned char record_type = TLS_RECORD_TYPE_DATA;
+	int rc = 0;
+
+	lock_sock(sk);
+
+	if (unlikely(msg->msg_controllen)) {
+		rc = tls_proccess_cmsg(sk, msg, &record_type);
+		if (rc)
+			goto out;
+	}
+
+	rc = tls_push_data(sk, &msg->msg_iter, size,
+			   msg->msg_flags, record_type);
+
+out:
+	release_sock(sk);
+	return rc;
+}
+
+int tls_device_sendpage(struct sock *sk, struct page *page,
+			int offset, size_t size, int flags)
+{
+	struct iov_iter	msg_iter;
+	struct kvec iov;
+	char *kaddr = kmap(page);
+	int rc = 0;
+
+	if (flags & MSG_SENDPAGE_NOTLAST)
+		flags |= MSG_MORE;
+
+	lock_sock(sk);
+
+	if (flags & MSG_OOB) {
+		rc = -ENOTSUPP;
+		goto out;
+	}
+
+	iov.iov_base = kaddr + offset;
+	iov.iov_len = size;
+	iov_iter_kvec(&msg_iter, WRITE | ITER_KVEC, &iov, 1, size);
+	rc = tls_push_data(sk, &msg_iter, size,
+			   flags, TLS_RECORD_TYPE_DATA);
+	kunmap(page);
+
+out:
+	release_sock(sk);
+	return rc;
+}
+
+struct tls_record_info *tls_get_record(struct tls_offload_context *context,
+				       u32 seq, u64 *p_record_sn)
+{
+	struct tls_record_info *info;
+	u64 record_sn = context->hint_record_sn;
+
+	info = context->retransmit_hint;
+	if (!info ||
+	    before(seq, info->end_seq - info->len)) {
+		/* if retransmit_hint is irrelevant, start
+		 * from the beginning of the list
+		 */
+		info = list_first_entry(&context->records_list,
+					struct tls_record_info, list);
+		record_sn = context->unacked_record_sn;
+	}
+
+	list_for_each_entry_from(info, &context->records_list, list) {
+		if (before(seq, info->end_seq)) {
+			if (!context->retransmit_hint ||
+			    after(info->end_seq,
+				  context->retransmit_hint->end_seq)) {
+				context->hint_record_sn = record_sn;
+				context->retransmit_hint = info;
+			}
+			*p_record_sn = record_sn;
+			return info;
+		}
+		record_sn++;
+	}
+
+	return NULL;
+}
+EXPORT_SYMBOL(tls_get_record);
+
+static int tls_device_push_pending_record(struct sock *sk, int flags)
+{
+	struct iov_iter	msg_iter;
+
+	iov_iter_kvec(&msg_iter, WRITE | ITER_KVEC, NULL, 0, 0);
+	return tls_push_data(sk, &msg_iter, 0, flags, TLS_RECORD_TYPE_DATA);
+}
+
+int tls_set_device_offload(struct sock *sk, struct tls_context *ctx)
+{
+	struct tls_crypto_info *crypto_info;
+	struct tls_offload_context *offload_ctx;
+	struct tls_record_info *start_marker_record;
+	u16 nonece_size, tag_size, iv_size, rec_seq_size;
+	char *iv, *rec_seq;
+	int rc;
+	struct net_device *netdev;
+	struct sk_buff *skb;
+
+	if (!ctx) {
+		rc = -EINVAL;
+		goto out;
+	}
+
+	if (ctx->priv_ctx) {
+		rc = -EEXIST;
+		goto out;
+	}
+
+	/* We support starting offload on multiple sockets
+	 * concurrently, so we only need a read lock here.
+	 */
+	percpu_down_read(&device_offload_lock);
+	netdev = get_netdev_for_sock(sk);
+	if (!netdev) {
+		pr_err("%s: netdev not found\n", __func__);
+		rc = -EINVAL;
+		goto release_lock;
+	}
+
+	if (!(netdev->features & NETIF_F_HW_TLS_TX)) {
+		rc = -ENOTSUPP;
+		goto release_netdev;
+	}
+
+	/* Avoid offloading if the device is down
+	 * We don't want to offload new flows after
+	 * the NETDEV_DOWN event
+	 */
+	if (!(netdev->flags & IFF_UP)) {
+		rc = -EINVAL;
+		goto release_lock;
+	}
+
+	crypto_info = &ctx->crypto_send;
+	switch (crypto_info->cipher_type) {
+	case TLS_CIPHER_AES_GCM_128: {
+		nonece_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
+		tag_size = TLS_CIPHER_AES_GCM_128_TAG_SIZE;
+		iv_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
+		iv = ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->iv;
+		rec_seq_size = TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE;
+		rec_seq =
+		 ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->rec_seq;
+		break;
+	}
+	default:
+		rc = -EINVAL;
+		goto release_netdev;
+	}
+
+	start_marker_record = kmalloc(sizeof(*start_marker_record), GFP_KERNEL);
+	if (!start_marker_record) {
+		rc = -ENOMEM;
+		goto release_netdev;
+	}
+
+	rc = attach_sock_to_netdev(sk, netdev, ctx);
+	if (rc)
+		goto free_marker_record;
+
+	ctx->netdev = netdev;
+
+	ctx->prepend_size = TLS_HEADER_SIZE + nonece_size;
+	ctx->tag_size = tag_size;
+	ctx->iv_size = iv_size;
+	ctx->iv = kmalloc(iv_size + TLS_CIPHER_AES_GCM_128_SALT_SIZE,
+			  GFP_KERNEL);
+	if (!ctx->iv) {
+		rc = -ENOMEM;
+		goto detach_sock;
+	}
+
+	memcpy(ctx->iv + TLS_CIPHER_AES_GCM_128_SALT_SIZE, rec_seq, iv_size);
+
+	ctx->rec_seq_size = rec_seq_size;
+	ctx->rec_seq = kmalloc(rec_seq_size, GFP_KERNEL);
+	if (!ctx->rec_seq) {
+		rc = -ENOMEM;
+		goto err_iv;
+	}
+	memcpy(ctx->rec_seq, rec_seq, rec_seq_size);
+
+	offload_ctx = ctx->priv_ctx;
+	memcpy(&offload_ctx->unacked_record_sn, rec_seq,
+	       sizeof(offload_ctx->unacked_record_sn));
+
+	/* start at rec_seq -1 to account for the start marker record */
+	offload_ctx->unacked_record_sn =
+		be64_to_cpu(offload_ctx->unacked_record_sn) - 1;
+
+	rc = tls_sw_fallback_init(sk, offload_ctx, crypto_info);
+	if (rc)
+		goto err_iv;
+
+	start_marker_record->end_seq = tcp_sk(sk)->write_seq;
+	start_marker_record->len = 0;
+	start_marker_record->num_frags = 0;
+
+	INIT_LIST_HEAD(&offload_ctx->records_list);
+	list_add_tail(&start_marker_record->list, &offload_ctx->records_list);
+	spin_lock_init(&offload_ctx->lock);
+
+	inet_csk(sk)->icsk_clean_acked = &tls_icsk_clean_acked;
+	ctx->push_pending_record = tls_device_push_pending_record;
+	offload_ctx->sk_destruct = sk->sk_destruct;
+
+	/* TLS offload is greatly simplified if we don't send
+	 * SKBs where only part of the payload needs to be encrypted.
+	 * So mark the last skb in the write queue as end of record.
+	 */
+	skb = tcp_write_queue_tail(sk);
+	if (skb)
+		TCP_SKB_CB(skb)->eor = 1;
+
+	refcount_set(&ctx->refcount, 1);
+	spin_lock_irq(&tls_device_lock);
+	list_add_tail(&ctx->list, &tls_device_list);
+	spin_unlock_irq(&tls_device_lock);
+
+	/* following this assignment tls_is_sk_tx_device_offloaded
+	 * will return true and the context might be accessed
+	 * by the netdev's xmit function.
+	 */
+	smp_store_release(&sk->sk_destruct,
+			  &tls_device_sk_destruct);
+	goto release_lock;
+
+err_iv:
+	kfree(ctx->iv);
+detach_sock:
+	netdev->tlsdev_ops->tls_dev_del(netdev, ctx, TLS_OFFLOAD_CTX_DIR_TX);
+free_marker_record:
+	kfree(start_marker_record);
+release_netdev:
+	dev_put(netdev);
+release_lock:
+	percpu_up_read(&device_offload_lock);
+out:
+	return rc;
+}
+
+static int tls_device_register(struct net_device *dev)
+{
+	if ((dev->features & NETIF_F_HW_TLS_TX) && !dev->tlsdev_ops)
+		return NOTIFY_BAD;
+
+	return NOTIFY_DONE;
+}
+
+static int tls_device_unregister(struct net_device *dev)
+{
+	return NOTIFY_DONE;
+}
+
+static int tls_device_feat_change(struct net_device *dev)
+{
+	if ((dev->features & NETIF_F_HW_TLS_TX) && !dev->tlsdev_ops)
+		return NOTIFY_BAD;
+
+	return NOTIFY_DONE;
+}
+
+static int tls_device_down(struct net_device *netdev)
+{
+	struct tls_context *ctx, *tmp;
+	struct list_head list;
+	unsigned long flags;
+
+	if (!(netdev->features & NETIF_F_HW_TLS_TX))
+		return NOTIFY_DONE;
+
+	/* Request a write lock to block new offload attempts
+	 */
+	percpu_down_write(&device_offload_lock);
+
+	spin_lock_irqsave(&tls_device_lock, flags);
+	INIT_LIST_HEAD(&list);
+
+	list_for_each_entry_safe(ctx, tmp, &tls_device_list, list) {
+		if (ctx->netdev != netdev ||
+		    !refcount_inc_not_zero(&ctx->refcount))
+			continue;
+
+		list_move(&ctx->list, &list);
+	}
+	spin_unlock_irqrestore(&tls_device_lock, flags);
+
+	list_for_each_entry_safe(ctx, tmp, &list, list)	{
+		netdev->tlsdev_ops->tls_dev_del(netdev, ctx,
+						TLS_OFFLOAD_CTX_DIR_TX);
+		ctx->netdev = NULL;
+		dev_put(netdev);
+		list_del_init(&ctx->list);
+
+		if (refcount_dec_and_test(&ctx->refcount))
+			tls_device_free_ctx(ctx);
+	}
+
+	percpu_up_write(&device_offload_lock);
+
+	flush_work(&tls_device_gc_work);
+
+	return NOTIFY_DONE;
+}
+
+static int tls_dev_event(struct notifier_block *this, unsigned long event,
+			 void *ptr)
+{
+	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+
+	switch (event) {
+	case NETDEV_REGISTER:
+		return tls_device_register(dev);
+
+	case NETDEV_UNREGISTER:
+		return tls_device_unregister(dev);
+
+	case NETDEV_FEAT_CHANGE:
+		return tls_device_feat_change(dev);
+
+	case NETDEV_DOWN:
+		return tls_device_down(dev);
+	}
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block tls_dev_notifier = {
+	.notifier_call	= tls_dev_event,
+};
+
+void __init tls_device_init(void)
+{
+	register_netdevice_notifier(&tls_dev_notifier);
+}
+
+void __exit tls_device_cleanup(void)
+{
+	unregister_netdevice_notifier(&tls_dev_notifier);
+	flush_work(&tls_device_gc_work);
+}
diff --git a/net/tls/tls_device_fallback.c b/net/tls/tls_device_fallback.c
new file mode 100644
index 000000000000..95d04d6ef0e4
--- /dev/null
+++ b/net/tls/tls_device_fallback.c
@@ -0,0 +1,405 @@
+/* Copyright (c) 2016-2017, Mellanox Technologies All rights reserved.
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ *      - Neither the name of the Mellanox Technologies nor the
+ *        names of its contributors may be used to endorse or promote
+ *        products derived from this software without specific prior written
+ *        permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
+ * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+ * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE
+ */
+
+#include <net/tls.h>
+#include <crypto/aead.h>
+#include <crypto/scatterwalk.h>
+
+static void chain_to_walk(struct scatterlist *sg, struct scatter_walk *walk)
+{
+	struct scatterlist *src = walk->sg;
+	int diff = walk->offset - src->offset;
+
+	sg_set_page(sg, sg_page(src),
+		    src->length - diff, walk->offset);
+
+	scatterwalk_crypto_chain(sg, sg_next(src), 0, 2);
+}
+
+static int tls_enc_record(struct aead_request *aead_req,
+			  struct crypto_aead *aead, char *aad, char *iv,
+			  __be64 rcd_sn, struct scatter_walk *in,
+			  struct scatter_walk *out, int *in_len)
+{
+	struct scatterlist sg_in[3];
+	struct scatterlist sg_out[3];
+	unsigned char buf[TLS_HEADER_SIZE + TLS_CIPHER_AES_GCM_128_IV_SIZE];
+	u16 len;
+	int rc;
+
+	len = min_t(int, *in_len, ARRAY_SIZE(buf));
+
+	scatterwalk_copychunks(buf, in, len, 0);
+	scatterwalk_copychunks(buf, out, len, 1);
+
+	*in_len -= len;
+	if (!*in_len)
+		return 0;
+
+	scatterwalk_pagedone(in, 0, 1);
+	scatterwalk_pagedone(out, 1, 1);
+
+	len = buf[4] | (buf[3] << 8);
+	len -= TLS_CIPHER_AES_GCM_128_IV_SIZE;
+
+	tls_make_aad(aad, len - TLS_CIPHER_AES_GCM_128_TAG_SIZE,
+		     (char *)&rcd_sn, sizeof(rcd_sn), buf[0]);
+
+	memcpy(iv + TLS_CIPHER_AES_GCM_128_SALT_SIZE, buf + TLS_HEADER_SIZE,
+	       TLS_CIPHER_AES_GCM_128_IV_SIZE);
+
+	sg_init_table(sg_in, ARRAY_SIZE(sg_in));
+	sg_init_table(sg_out, ARRAY_SIZE(sg_out));
+	sg_set_buf(sg_in, aad, TLS_AAD_SPACE_SIZE);
+	sg_set_buf(sg_out, aad, TLS_AAD_SPACE_SIZE);
+	chain_to_walk(sg_in + 1, in);
+	chain_to_walk(sg_out + 1, out);
+
+	*in_len -= len;
+	if (*in_len < 0) {
+		*in_len += TLS_CIPHER_AES_GCM_128_TAG_SIZE;
+		if (*in_len < 0)
+		/* the input buffer doesn't contain the entire record.
+		 * trim len accordingly. The resulting authentication tag
+		 * will contain garbage, but we don't care as we won't
+		 * include any of it in the output skb
+		 * Note that we assume the output buffer length
+		 * is larger than the input buffer length + tag size
+		 */
+			len += *in_len;
+
+		*in_len = 0;
+	}
+
+	if (*in_len) {
+		scatterwalk_copychunks(NULL, in, len, 2);
+		scatterwalk_pagedone(in, 0, 1);
+		scatterwalk_copychunks(NULL, out, len, 2);
+		scatterwalk_pagedone(out, 1, 1);
+	}
+
+	len -= TLS_CIPHER_AES_GCM_128_TAG_SIZE;
+	aead_request_set_crypt(aead_req, sg_in, sg_out, len, iv);
+
+	rc = crypto_aead_encrypt(aead_req);
+
+	return rc;
+}
+
+static void tls_init_aead_request(struct aead_request *aead_req,
+				  struct crypto_aead *aead)
+{
+	aead_request_set_tfm(aead_req, aead);
+	aead_request_set_ad(aead_req, TLS_AAD_SPACE_SIZE);
+}
+
+static struct aead_request *tls_alloc_aead_request(struct crypto_aead *aead,
+						   gfp_t flags)
+{
+	unsigned int req_size = sizeof(struct aead_request) +
+		crypto_aead_reqsize(aead);
+	struct aead_request *aead_req;
+
+	aead_req = kzalloc(req_size, flags);
+	if (!aead_req)
+		return NULL;
+
+	tls_init_aead_request(aead_req, aead);
+	return aead_req;
+}
+
+static int tls_enc_records(struct aead_request *aead_req,
+			   struct crypto_aead *aead, struct scatterlist *sg_in,
+			   struct scatterlist *sg_out, char *aad, char *iv,
+			   u64 rcd_sn, int len)
+{
+	struct scatter_walk in;
+	struct scatter_walk out;
+	int rc;
+
+	scatterwalk_start(&in, sg_in);
+	scatterwalk_start(&out, sg_out);
+
+	do {
+		rc = tls_enc_record(aead_req, aead, aad, iv,
+				    cpu_to_be64(rcd_sn), &in, &out, &len);
+		rcd_sn++;
+
+	} while (rc == 0 && len);
+
+	scatterwalk_done(&in, 0, 0);
+	scatterwalk_done(&out, 1, 0);
+
+	return rc;
+}
+
+static inline void update_chksum(struct sk_buff *skb, int headln)
+{
+	/* Can't use icsk->icsk_af_ops->send_check here because the ip addresses
+	 * might have been changed by NAT.
+	 */
+
+	const struct ipv6hdr *ipv6h;
+	const struct iphdr *iph;
+	struct tcphdr *th = tcp_hdr(skb);
+	int datalen = skb->len - headln;
+
+	/* We only changed the payload so if we are using partial we don't
+	 * need to update anything.
+	 */
+	if (likely(skb->ip_summed == CHECKSUM_PARTIAL))
+		return;
+
+	skb->ip_summed = CHECKSUM_PARTIAL;
+	skb->csum_start = skb_transport_header(skb) - skb->head;
+	skb->csum_offset = offsetof(struct tcphdr, check);
+
+	if (skb->sk->sk_family == AF_INET6) {
+		ipv6h = ipv6_hdr(skb);
+		th->check = ~csum_ipv6_magic(&ipv6h->saddr, &ipv6h->daddr,
+					     datalen, IPPROTO_TCP, 0);
+	} else {
+		iph = ip_hdr(skb);
+		th->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, datalen,
+					       IPPROTO_TCP, 0);
+	}
+}
+
+static void complete_skb(struct sk_buff *nskb, struct sk_buff *skb, int headln)
+{
+	skb_copy_header(nskb, skb);
+
+	skb_put(nskb, skb->len);
+	memcpy(nskb->data, skb->data, headln);
+	update_chksum(nskb, headln);
+
+	nskb->destructor = skb->destructor;
+	nskb->sk = skb->sk;
+	skb->destructor = NULL;
+	skb->sk = NULL;
+	refcount_add(nskb->truesize - skb->truesize,
+		     &nskb->sk->sk_wmem_alloc);
+}
+
+/* This function may be called after the user socket is already
+ * closed so make sure we don't use anything freed during
+ * tls_sk_proto_close here
+ */
+struct sk_buff *tls_sw_fallback(struct sock *sk, struct sk_buff *skb)
+{
+	int tcp_header_size = tcp_hdrlen(skb);
+	int tcp_payload_offset = skb_transport_offset(skb) + tcp_header_size;
+	int payload_len = skb->len - tcp_payload_offset;
+	struct tls_context *tls_ctx = tls_get_ctx(sk);
+	struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
+	int remaining, buf_len, resync_sgs, rc, i = 0;
+	void *buf, *dummy_buf, *iv, *aad;
+	struct scatterlist sg_in[2 * (MAX_SKB_FRAGS + 1)];
+	struct scatterlist sg_out[3];
+	u32 tcp_seq = ntohl(tcp_hdr(skb)->seq);
+	struct aead_request *aead_req;
+	struct sk_buff *nskb = NULL;
+	struct tls_record_info *record;
+	unsigned long flags;
+	s32 sync_size;
+	u64 rcd_sn;
+
+	if (!payload_len)
+		return skb;
+
+	sg_init_table(sg_in, ARRAY_SIZE(sg_in));
+	sg_init_table(sg_out, ARRAY_SIZE(sg_out));
+
+	spin_lock_irqsave(&ctx->lock, flags);
+	record = tls_get_record(ctx, tcp_seq, &rcd_sn);
+	if (!record) {
+		spin_unlock_irqrestore(&ctx->lock, flags);
+		WARN(1, "Record not found for seq %u\n", tcp_seq);
+		goto free_orig;
+	}
+
+	sync_size = tcp_seq - tls_record_start_seq(record);
+	if (sync_size < 0) {
+		int is_start_marker = tls_record_is_start_marker(record);
+
+		spin_unlock_irqrestore(&ctx->lock, flags);
+		if (!is_start_marker)
+		/* This should only occur if the relevant record was
+		 * already acked. In that case it should be ok
+		 * to drop the packet and avoid retransmission.
+		 *
+		 * There is a corner case where the packet contains
+		 * both an acked and a non-acked record.
+		 * We currently don't handle that case and rely
+		 * on TCP to retransmit a packet that doesn't contain
+		 * already acked payload.
+		 */
+			goto free_orig;
+
+		if (payload_len > -sync_size) {
+			WARN(1, "Fallback of partially offloaded packets is not supported\n");
+			goto free_orig;
+		} else {
+			return skb;
+		}
+	}
+
+	remaining = sync_size;
+	while (remaining > 0) {
+		skb_frag_t *frag = &record->frags[i];
+
+		__skb_frag_ref(frag);
+		sg_set_page(sg_in + i, skb_frag_page(frag),
+			    skb_frag_size(frag), frag->page_offset);
+
+		remaining -= skb_frag_size(frag);
+
+		if (remaining < 0)
+			sg_in[i].length += remaining;
+
+		i++;
+	}
+	spin_unlock_irqrestore(&ctx->lock, flags);
+	resync_sgs = i;
+
+	aead_req = tls_alloc_aead_request(ctx->aead_send, GFP_ATOMIC);
+	if (!aead_req)
+		goto put_sg;
+
+	buf_len = TLS_CIPHER_AES_GCM_128_SALT_SIZE +
+		  TLS_CIPHER_AES_GCM_128_IV_SIZE +
+		  TLS_AAD_SPACE_SIZE +
+		  sync_size +
+		  tls_ctx->tag_size;
+	buf = kmalloc(buf_len, GFP_ATOMIC);
+	if (!buf)
+		goto free_req;
+
+	nskb = alloc_skb(skb_headroom(skb) + skb->len, GFP_ATOMIC);
+	if (!nskb)
+		goto free_req;
+
+	skb_reserve(nskb, skb_headroom(skb));
+
+	iv = buf;
+
+	memcpy(iv, tls_ctx->crypto_send_aes_gcm_128.salt,
+	       TLS_CIPHER_AES_GCM_128_SALT_SIZE);
+	aad = buf + TLS_CIPHER_AES_GCM_128_SALT_SIZE +
+	      TLS_CIPHER_AES_GCM_128_IV_SIZE;
+	dummy_buf = aad + TLS_AAD_SPACE_SIZE;
+
+	sg_set_buf(&sg_out[0], dummy_buf, sync_size);
+	sg_set_buf(&sg_out[1], nskb->data + tcp_payload_offset,
+		   payload_len);
+	/* Add room for authentication tag produced by crypto */
+	dummy_buf += sync_size;
+	sg_set_buf(&sg_out[2], dummy_buf, tls_ctx->tag_size);
+	rc = skb_to_sgvec(skb, &sg_in[i], tcp_payload_offset,
+			  payload_len);
+	if (rc < 0)
+		goto free_nskb;
+
+	rc = tls_enc_records(aead_req, ctx->aead_send, sg_in, sg_out, aad, iv,
+			     rcd_sn, sync_size + payload_len);
+	if (rc < 0)
+		goto free_nskb;
+
+	complete_skb(nskb, skb, tcp_payload_offset);
+
+	/* validate_xmit_skb_list assumes that if the skb wasn't segmented
+	 * nskb->prev will point to the skb itself
+	 */
+	nskb->prev = nskb;
+free_buf:
+	kfree(buf);
+free_req:
+	kfree(aead_req);
+put_sg:
+	for (i = 0; i < resync_sgs; i++)
+		put_page(sg_page(&sg_in[i]));
+free_orig:
+	kfree_skb(skb);
+	return nskb;
+
+free_nskb:
+	kfree_skb(nskb);
+	nskb = NULL;
+	goto free_buf;
+}
+
+static struct sk_buff *
+tls_validate_xmit(struct sock *sk, struct net_device *dev, struct sk_buff *skb)
+{
+	if (dev == tls_get_ctx(sk)->netdev)
+		return skb;
+
+	return tls_sw_fallback(sk, skb);
+}
+
+int tls_sw_fallback_init(struct sock *sk,
+			 struct tls_offload_context *offload_ctx,
+			 struct tls_crypto_info *crypto_info)
+{
+	int rc;
+	const u8 *key;
+
+	offload_ctx->aead_send =
+	    crypto_alloc_aead("gcm(aes)", 0, CRYPTO_ALG_ASYNC);
+	if (IS_ERR(offload_ctx->aead_send)) {
+		pr_err("crypto_alloc_aead failed\n");
+		rc = PTR_ERR(offload_ctx->aead_send);
+		offload_ctx->aead_send = NULL;
+		goto err_out;
+	}
+
+	key = ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->key;
+
+	rc = crypto_aead_setkey(offload_ctx->aead_send, key,
+				TLS_CIPHER_AES_GCM_128_KEY_SIZE);
+	if (rc)
+		goto free_aead;
+
+	rc = crypto_aead_setauthsize(offload_ctx->aead_send,
+				     TLS_CIPHER_AES_GCM_128_TAG_SIZE);
+	if (rc)
+		goto free_aead;
+
+	sk->sk_offload_check = tls_validate_xmit;
+	return 0;
+free_aead:
+	crypto_free_aead(offload_ctx->aead_send);
+err_out:
+	return rc;
+}
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index e07ee3ae0023..f5a211cc30b1 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -48,6 +48,9 @@ MODULE_LICENSE("Dual BSD/GPL");
 enum {
 	TLS_BASE_TX,
 	TLS_SW_TX,
+#ifdef CONFIG_TLS_DEVICE
+	TLS_HW_TX,
+#endif
 	TLS_NUM_CONFIG,
 };
 
@@ -401,11 +404,19 @@ static int do_tls_setsockopt_tx(struct sock *sk, char __user *optval,
 		goto out;
 	}
 
-	/* currently SW is default, we will have ethtool in future */
-	rc = tls_set_sw_offload(sk, ctx);
-	tx_conf = TLS_SW_TX;
-	if (rc)
-		goto err_crypto_info;
+#ifdef CONFIG_TLS_DEVICE
+	rc = tls_set_device_offload(sk, ctx);
+	tx_conf = TLS_HW_TX;
+	if (rc) {
+#else
+	{
+#endif
+		/* if HW offload fails fallback to SW */
+		rc = tls_set_sw_offload(sk, ctx);
+		tx_conf = TLS_SW_TX;
+		if (rc)
+			goto err_crypto_info;
+	}
 
 	ctx->tx_conf = tx_conf;
 	update_sk_prot(sk, ctx);
@@ -487,12 +498,21 @@ static void build_protos(struct proto *prot, struct proto *base)
 	prot[TLS_SW_TX] = prot[TLS_BASE_TX];
 	prot[TLS_SW_TX].sendmsg		= tls_sw_sendmsg;
 	prot[TLS_SW_TX].sendpage	= tls_sw_sendpage;
+
+#ifdef CONFIG_TLS_DEVICE
+	prot[TLS_HW_TX] = prot[TLS_SW_TX];
+	prot[TLS_HW_TX].sendmsg		= tls_device_sendmsg;
+	prot[TLS_HW_TX].sendpage	= tls_device_sendpage;
+#endif
 }
 
 static int __init tls_register(void)
 {
 	build_protos(tls_prots, &tcp_prot);
 
+#ifdef CONFIG_TLS_DEVICE
+	tls_device_init();
+#endif
 	tcp_register_ulp(&tcp_tls_ulp_ops);
 
 	return 0;
@@ -501,6 +521,9 @@ static int __init tls_register(void)
 static void __exit tls_unregister(void)
 {
 	tcp_unregister_ulp(&tcp_tls_ulp_ops);
+#ifdef CONFIG_TLS_DEVICE
+	tls_device_cleanup();
+#endif
 }
 
 module_init(tls_register);
-- 
2.15.0.317.g14c63a9

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure
  2017-12-18 11:10 [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure Ilya Lesokhin
                   ` (5 preceding siblings ...)
  2017-12-18 11:10 ` [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure Ilya Lesokhin
@ 2017-12-18 17:10 ` Jiri Pirko
  2017-12-19 10:30   ` Jiri Pirko
  6 siblings, 1 reply; 35+ messages in thread
From: Jiri Pirko @ 2017-12-18 17:10 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: netdev, davem, davejwatson, tom, hannes, borisp, aviadye, liranl

Mon, Dec 18, 2017 at 12:10:27PM CET, ilyal@mellanox.com wrote:
>Changes from v2:
>- Fix sk use after free and possible netdev use after free
>- tls device now keeps a refernce on the offloading netdev
>- tls device registers to the netdev notifer. 
>  Upon a NETDEV_DOWN event, offload is stopped and
>  the reference on the netdev is dropped.
>- SW fallback support for skb->ip_summed != CHECKSUM_PARTIAL 
>- Merged TLS patches are no longer part of this series.
>
>Changes from v1:
>- Remove the binding of the socket to a specific netdev 
>  through sk->sk_bound_dev_if.
>  Add a check in validate_xmit_skb to detect route changes
>  and call SW fallback code to do the crypto in software.
>- tls_get_record now returns the tls record sequence number.
>  This is required to support connections with rcd_sn != iv.
>- Bug fixes to the TLS code.
>
>This patchset adds a generic infrastructure to offload TLS crypto to a
>network devices.
>
>patches 1-2 Export functions that we need
>patch 3 adds infrastructue for offloaded socket fallback
>patches 4-5 add new NDOs and capabilities.
>patch 6 adds the TLS NIC offload infrastructure.
>
>Github with mlx5e TLS offload support:
>https://github.com/Mellanox/tls-offload/tree/tls_device_v3

I don't get it. You are pushing the infra but not the actual driver part
that consumes the infra? Why?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 3/6] net: Add SW fallback infrastructure for offloaded sockets
  2017-12-18 11:10 ` [PATCH v3 net-next 3/6] net: Add SW fallback infrastructure for offloaded sockets Ilya Lesokhin
@ 2017-12-18 19:18   ` Marcelo Ricardo Leitner
  2017-12-19  7:51     ` Ilya Lesokhin
  2017-12-19 19:12   ` Eric Dumazet
  1 sibling, 1 reply; 35+ messages in thread
From: Marcelo Ricardo Leitner @ 2017-12-18 19:18 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: netdev, davem, davejwatson, tom, hannes, borisp, aviadye, liranl

On Mon, Dec 18, 2017 at 01:10:30PM +0200, Ilya Lesokhin wrote:
> Offloaded sockets rely on the netdev to transform the transmitted
> packets before sending them over the network.
> When a packet from an offloaded socket is looped back or
> rerouted to a different device we need to detect it and
> do the transformation in software
> 
> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
> Signed-off-by: Boris Pismenny <borisp@mellanox.com>
> ---
>  include/net/sock.h | 17 +++++++++++++++++
>  net/core/dev.c     |  4 ++++
>  2 files changed, 21 insertions(+)
> 
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 9a9047268d37..5397307603ec 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -479,6 +479,9 @@ struct sock {
>  	void			(*sk_error_report)(struct sock *sk);
>  	int			(*sk_backlog_rcv)(struct sock *sk,
>  						  struct sk_buff *skb);
> +	struct sk_buff*		(*sk_offload_check)(struct sock *sk,
> +						    struct net_device *dev,
> +						    struct sk_buff *skb);
>  	void                    (*sk_destruct)(struct sock *sk);
>  	struct sock_reuseport __rcu	*sk_reuseport_cb;
>  	struct rcu_head		sk_rcu;
> @@ -2324,6 +2327,20 @@ static inline bool sk_fullsock(const struct sock *sk)
>  	return (1 << sk->sk_state) & ~(TCPF_TIME_WAIT | TCPF_NEW_SYN_RECV);
>  }
>  
> +/* Checks if this SKB belongs to an HW offloaded socket
> + * and whether any SW fallbacks are required based on dev.
> + */
> +static inline struct sk_buff *skb_offload_check(struct sk_buff *skb,
> +						struct net_device *dev)
> +{
> +	struct sock *sk = skb->sk;
> +
> +	if (sk && sk_fullsock(sk) && sk->sk_offload_check)

Isn't this going to hurt the fast path, checking for sk fields here?

> +		skb = sk->sk_offload_check(sk, dev, skb);
> +
> +	return skb;
> +}
> +
>  /* This helper checks if a socket is a LISTEN or NEW_SYN_RECV
>   * SYNACK messages can be attached to either ones (depending on SYNCOOKIE)
>   */
> diff --git a/net/core/dev.c b/net/core/dev.c
> index b0eee49a2489..6a78d9046674 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3051,6 +3051,10 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device
>  	if (unlikely(!skb))
>  		goto out_null;
>  
> +	skb = skb_offload_check(skb, dev);
> +	if (!skb)
> +		goto out_null;
> +
>  	if (netif_needs_gso(skb, features)) {
>  		struct sk_buff *segs;
>  
> -- 
> 2.15.0.317.g14c63a9
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.
  2017-12-18 11:10 ` [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure Ilya Lesokhin
@ 2017-12-18 19:53   ` Marcelo Ricardo Leitner
  2017-12-19  7:31     ` Ilya Lesokhin
  2017-12-19  7:00   ` kbuild test robot
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 35+ messages in thread
From: Marcelo Ricardo Leitner @ 2017-12-18 19:53 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: netdev, davem, davejwatson, tom, hannes, borisp, aviadye, liranl

On Mon, Dec 18, 2017 at 01:10:33PM +0200, Ilya Lesokhin wrote:
> This patch adds a generic infrastructure to offload TLS crypto to a
> network devices. It enables the kernel TLS socket to skip encryption
> and authentication operations on the transmit side of the data path.
> Leaving those computationally expensive operations to the NIC.

I have a hard time understanding why this was named 'tls_device' if no
net_device's are registered.

> 
> The NIC offload infrastructure builds TLS records and pushes them to
> the TCP layer just like the SW KTLS implementation and using the same API.
> TCP segmentation is mostly unaffected. Currently the only exception is
> that we prevent mixed SKBs where only part of the payload requires
> offload. In the future we are likely to add a similar restriction
> following a change cipher spec record.
> 
> The notable differences between SW KTLS and NIC offloaded TLS
> implementations are as follows:
> 1. The offloaded implementation builds "plaintext TLS record", those
> records contain plaintext instead of ciphertext and place holder bytes
> instead of authentication tags.
> 2. The offloaded implementation maintains a mapping from TCP sequence
> number to TLS records. Thus given a TCP SKB sent from a NIC offloaded
> TLS socket, we can use the tls NIC offload infrastructure to obtain
> enough context to encrypt the payload of the SKB.
> A TLS record is released when the last byte of the record is ack'ed,
> this is done through the new icsk_clean_acked callback.
> 
> The infrastructure should be extendable to support various NIC offload
> implementations.  However it is currently written with the
> implementation below in mind:
> The NIC assumes that packets from each offloaded stream are sent as
> plaintext and in-order. It keeps track of the TLS records in the TCP
> stream. When a packet marked for offload is transmitted, the NIC
> encrypts the payload in-place and puts authentication tags in the
> relevant place holders.
> 
> The responsibility for handling out-of-order packets (i.e. TCP
> retransmission, qdisc drops) falls on the netdev driver.
> 
> The netdev driver keeps track of the expected TCP SN from the NIC's
> perspective.  If the next packet to transmit matches the expected TCP
> SN, the driver advances the expected TCP SN, and transmits the packet
> with TLS offload indication.
> 
> If the next packet to transmit does not match the expected TCP SN. The
> driver calls the TLS layer to obtain the TLS record that includes the
> TCP of the packet for transmission. Using this TLS record, the driver
> posts a work entry on the transmit queue to reconstruct the NIC TLS
> state required for the offload of the out-of-order packet. It updates
> the expected TCP SN accordingly and transmit the now in-order packet.
> The same queue is used for packet transmission and TLS context
> reconstruction to avoid the need for flushing the transmit queue before
> issuing the context reconstruction request.
> 
> Signed-off-by: Boris Pismenny <borisp@mellanox.com>
> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
> ---
>  include/net/tls.h             |  62 +++-
>  net/tls/Kconfig               |   9 +
>  net/tls/Makefile              |   3 +
>  net/tls/tls_device.c          | 800 ++++++++++++++++++++++++++++++++++++++++++
>  net/tls/tls_device_fallback.c | 405 +++++++++++++++++++++
>  net/tls/tls_main.c            |  33 +-
>  6 files changed, 1305 insertions(+), 7 deletions(-)
>  create mode 100644 net/tls/tls_device.c
>  create mode 100644 net/tls/tls_device_fallback.c
> 
> diff --git a/include/net/tls.h b/include/net/tls.h
> index 936cfc5cab7d..9c1b5d13d9a7 100644
> --- a/include/net/tls.h
> +++ b/include/net/tls.h
> @@ -75,6 +75,29 @@ struct tls_sw_context {
>  	struct scatterlist sg_aead_out[2];
>  };
>  
> +struct tls_record_info {
> +	struct list_head list;
> +	u32 end_seq;
> +	int len;
> +	int num_frags;
> +	skb_frag_t frags[MAX_SKB_FRAGS];
> +};
> +
> +struct tls_offload_context {
> +	struct crypto_aead *aead_send;
> +
> +	struct list_head records_list;
> +	struct scatterlist sg_tx_data[MAX_SKB_FRAGS];
> +	void (*sk_destruct)(struct sock *sk);
> +	struct tls_record_info *open_record;
> +	struct tls_record_info *retransmit_hint;
> +	u64 hint_record_sn;
> +	u64 unacked_record_sn;
> +
> +	u32 expected_seq;
> +	spinlock_t lock;	/* protects records list */
> +};
> +
>  enum {
>  	TLS_PENDING_CLOSED_RECORD
>  };
> @@ -85,6 +108,10 @@ struct tls_context {
>  		struct tls12_crypto_info_aes_gcm_128 crypto_send_aes_gcm_128;
>  	};
>  
> +	struct list_head list;
> +	struct net_device *netdev;
> +	refcount_t refcount;
> +
>  	void *priv_ctx;
>  
>  	u8 tx_conf:2;
> @@ -129,9 +156,29 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
>  void tls_sw_close(struct sock *sk, long timeout);
>  void tls_sw_free_tx_resources(struct sock *sk);
>  
> -void tls_sk_destruct(struct sock *sk, struct tls_context *ctx);
> -void tls_icsk_clean_acked(struct sock *sk);
> +void tls_clear_device_offload(struct sock *sk, struct tls_context *ctx);
> +int tls_set_device_offload(struct sock *sk, struct tls_context *ctx);
> +int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
> +int tls_device_sendpage(struct sock *sk, struct page *page,
> +			int offset, size_t size, int flags);
> +void tls_device_sk_destruct(struct sock *sk);
> +void tls_device_init(void);
> +void tls_device_cleanup(void);
>  
> +struct tls_record_info *tls_get_record(struct tls_offload_context *context,
> +				       u32 seq, u64 *p_record_sn);
> +
> +static inline bool tls_record_is_start_marker(struct tls_record_info *rec)
> +{
> +	return rec->len == 0;
> +}
> +
> +static inline u32 tls_record_start_seq(struct tls_record_info *rec)
> +{
> +	return rec->end_seq - rec->len;
> +}
> +
> +void tls_sk_destruct(struct sock *sk, struct tls_context *ctx);
>  int tls_push_sg(struct sock *sk, struct tls_context *ctx,
>  		struct scatterlist *sg, u16 first_offset,
>  		int flags);
> @@ -168,6 +215,13 @@ static inline bool tls_is_pending_open_record(struct tls_context *tls_ctx)
>  	return tls_ctx->pending_open_record_frags;
>  }
>  
> +static inline bool tls_is_sk_tx_device_offloaded(struct sock *sk)
> +{
> +	return sk_fullsock(sk) &&
> +	       /* matches smp_store_release in tls_set_device_offload */
> +	       smp_load_acquire(&sk->sk_destruct) == &tls_device_sk_destruct;
> +}
> +
>  static inline void tls_err_abort(struct sock *sk)
>  {
>  	sk->sk_err = -EBADMSG;
> @@ -255,4 +309,8 @@ static inline struct tls_offload_context *tls_offload_ctx(
>  int tls_proccess_cmsg(struct sock *sk, struct msghdr *msg,
>  		      unsigned char *record_type);
>  
> +int tls_sw_fallback_init(struct sock *sk,
> +			 struct tls_offload_context *offload_ctx,
> +			 struct tls_crypto_info *crypto_info);
> +
>  #endif /* _TLS_OFFLOAD_H */
> diff --git a/net/tls/Kconfig b/net/tls/Kconfig
> index eb583038c67e..1a4ea55c2f09 100644
> --- a/net/tls/Kconfig
> +++ b/net/tls/Kconfig
> @@ -13,3 +13,12 @@ config TLS
>  	encryption handling of the TLS protocol to be done in-kernel.
>  
>  	If unsure, say N.
> +
> +config TLS_DEVICE
> +	bool "Transport Layer Security HW offload"
> +	depends on TLS
> +	default n
> +	---help---
> +	Enable kernel support for HW offload of the TLS protocol.
> +
> +	If unsure, say N.
> diff --git a/net/tls/Makefile b/net/tls/Makefile
> index a930fd1c4f7b..44483cd47b3a 100644
> --- a/net/tls/Makefile
> +++ b/net/tls/Makefile
> @@ -5,3 +5,6 @@
>  obj-$(CONFIG_TLS) += tls.o
>  
>  tls-y := tls_main.o tls_sw.o
> +
> +tls-$(CONFIG_TLS_DEVICE) += tls_device.o tls_device_fallback.o
> +
> diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
> new file mode 100644
> index 000000000000..5082d693a503
> --- /dev/null
> +++ b/net/tls/tls_device.c
> @@ -0,0 +1,800 @@
> +/* Copyright (c) 2016-2017, Mellanox Technologies All rights reserved.
> + *
> + *     Redistribution and use in source and binary forms, with or
> + *     without modification, are permitted provided that the following
> + *     conditions are met:
> + *
> + *      - Redistributions of source code must retain the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer.
> + *
> + *      - Redistributions in binary form must reproduce the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer in the documentation and/or other materials
> + *        provided with the distribution.
> + *
> + *      - Neither the name of the Mellanox Technologies nor the
> + *        names of its contributors may be used to endorse or promote
> + *        products derived from this software without specific prior written
> + *        permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
> + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
> + * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED.
> + * IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
> + * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
> + * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
> + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
> + * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> + * POSSIBILITY OF SUCH DAMAGE
> + */
> +
> +#include <linux/module.h>
> +#include <net/tcp.h>
> +#include <net/inet_common.h>
> +#include <linux/highmem.h>
> +#include <linux/netdevice.h>
> +
> +#include <net/tls.h>
> +#include <crypto/aead.h>
> +
> +/* device_offload_lock is used to synchronize tls_dev_add
> + * against NETDEV_DOWN notifications.
> + */
> +DEFINE_STATIC_PERCPU_RWSEM(device_offload_lock);
> +
> +static void tls_device_gc_task(struct work_struct *work);
> +
> +static DECLARE_WORK(tls_device_gc_work, tls_device_gc_task);
> +static LIST_HEAD(tls_device_gc_list);
> +static LIST_HEAD(tls_device_list);
> +static DEFINE_SPINLOCK(tls_device_lock);
> +
> +static void tls_device_free_ctx(struct tls_context *ctx)
> +{
> +	struct tls_offload_context *offlad_ctx = tls_offload_ctx(ctx);
> +
> +	kfree(offlad_ctx);
> +	kfree(ctx);
> +}
> +
> +static void tls_device_gc_task(struct work_struct *work)
> +{
> +	struct tls_context *ctx, *tmp;
> +	struct list_head gc_list;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&tls_device_lock, flags);
> +	INIT_LIST_HEAD(&gc_list);
> +	list_splice_init(&tls_device_gc_list, &gc_list);
> +	spin_unlock_irqrestore(&tls_device_lock, flags);
> +
> +	list_for_each_entry_safe(ctx, tmp, &gc_list, list) {
> +		struct net_device *netdev = ctx->netdev;
> +
> +		if (netdev) {
> +			netdev->tlsdev_ops->tls_dev_del(netdev, ctx,
> +							TLS_OFFLOAD_CTX_DIR_TX);
> +			dev_put(netdev);
> +		}
> +
> +		list_del(&ctx->list);
> +		tls_device_free_ctx(ctx);
> +	}
> +}
> +
> +static void tls_device_queue_ctx_destruction(struct tls_context *ctx)
> +{
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&tls_device_lock, flags);
> +	list_move_tail(&ctx->list, &tls_device_gc_list);
> +
> +	/* schedule_work inside the spinlock
> +	 * to make sure tls_device_down waits for that work.
> +	 */
> +	schedule_work(&tls_device_gc_work);
> +
> +	spin_unlock_irqrestore(&tls_device_lock, flags);
> +}
> +
> +/* We assume that the socket is already connected */
> +static struct net_device *get_netdev_for_sock(struct sock *sk)
> +{
> +	struct inet_sock *inet = inet_sk(sk);
> +	struct net_device *netdev = NULL;
> +
> +	netdev = dev_get_by_index(sock_net(sk), inet->cork.fl.flowi_oif);
> +
> +	return netdev;
> +}
> +
> +static int attach_sock_to_netdev(struct sock *sk, struct net_device *netdev,
> +				 struct tls_context *ctx)
> +{
> +	int rc;
> +
> +	rc = netdev->tlsdev_ops->tls_dev_add(netdev, sk, TLS_OFFLOAD_CTX_DIR_TX,
> +					     &ctx->crypto_send);
> +	if (rc) {
> +		pr_err("The netdev has refused to offload this socket\n");

_ratelimit here is probably welcomed, as this could be triggered by
users at will.
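
e.g. something like this, keeping the same message:

	pr_err_ratelimited("The netdev has refused to offload this socket\n");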

> +		goto out;
> +	}
> +
> +	rc = 0;
> +out:
> +	return rc;
> +}
> +
> +static void destroy_record(struct tls_record_info *record)
> +{
> +	skb_frag_t *frag;
> +	int nr_frags = record->num_frags;
> +
> +	while (nr_frags > 0) {
> +		frag = &record->frags[nr_frags - 1];
> +		__skb_frag_unref(frag);
> +		--nr_frags;
> +	}
> +	kfree(record);
> +}
> +
> +static void delete_all_records(struct tls_offload_context *offload_ctx)
> +{
> +	struct tls_record_info *info, *temp;
> +
> +	list_for_each_entry_safe(info, temp, &offload_ctx->records_list, list) {
> +		list_del(&info->list);
> +		destroy_record(info);
> +	}
> +
> +	offload_ctx->retransmit_hint = NULL;
> +}
> +
> +static void tls_icsk_clean_acked(struct sock *sk)
> +{
> +	struct tls_context *tls_ctx = tls_get_ctx(sk);
> +	struct tls_offload_context *ctx;
> +	struct tcp_sock *tp = tcp_sk(sk);
> +	struct tls_record_info *info, *temp;
> +	unsigned long flags;
> +	u64 deleted_records = 0;
> +
> +	if (!tls_ctx)
> +		return;
> +
> +	ctx = tls_offload_ctx(tls_ctx);
> +
> +	spin_lock_irqsave(&ctx->lock, flags);

Would be nice if this spinlock could be avoided somehow, as it's being
called right from tcp_ack().

> +	info = ctx->retransmit_hint;
> +	if (info && !before(tp->snd_una, info->end_seq)) {
> +		ctx->retransmit_hint = NULL;
> +		list_del(&info->list);
> +		destroy_record(info);
> +		deleted_records++;
> +	}
> +
> +	list_for_each_entry_safe(info, temp, &ctx->records_list, list) {
> +		if (before(tp->snd_una, info->end_seq))
> +			break;
> +		list_del(&info->list);
> +
> +		destroy_record(info);
> +		deleted_records++;
> +	}
> +
> +	ctx->unacked_record_sn += deleted_records;
> +	spin_unlock_irqrestore(&ctx->lock, flags);
> +}
> +
> +/* At this point, there should be no references on this
> + * socket and no in-flight SKBs associated with this
> + * socket, so it is safe to free all the resources.
> + */
> +void tls_device_sk_destruct(struct sock *sk)
> +{
> +	struct tls_context *tls_ctx = tls_get_ctx(sk);
> +	struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
> +
> +	if (ctx->open_record)
> +		destroy_record(ctx->open_record);
> +
> +	delete_all_records(ctx);
> +	crypto_free_aead(ctx->aead_send);
> +	ctx->sk_destruct(sk);
> +
> +	if (refcount_dec_and_test(&tls_ctx->refcount))
> +		tls_device_queue_ctx_destruction(tls_ctx);
> +}
> +EXPORT_SYMBOL(tls_device_sk_destruct);
> +
> +static inline void tls_append_frag(struct tls_record_info *record,
> +				   struct page_frag *pfrag,
> +				   int size)
> +{
> +	skb_frag_t *frag;
> +
> +	frag = &record->frags[record->num_frags - 1];
> +	if (frag->page.p == pfrag->page &&
> +	    frag->page_offset + frag->size == pfrag->offset) {
> +		frag->size += size;
> +	} else {
> +		++frag;
> +		frag->page.p = pfrag->page;
> +		frag->page_offset = pfrag->offset;
> +		frag->size = size;
> +		++record->num_frags;
> +		get_page(pfrag->page);
> +	}
> +
> +	pfrag->offset += size;
> +	record->len += size;
> +}
> +
> +static inline int tls_push_record(struct sock *sk,
> +				  struct tls_context *ctx,
> +				  struct tls_offload_context *offload_ctx,
> +				  struct tls_record_info *record,
> +				  struct page_frag *pfrag,
> +				  int flags,
> +				  unsigned char record_type)
> +{
> +	skb_frag_t *frag;
> +	struct tcp_sock *tp = tcp_sk(sk);
> +	struct page_frag fallback_frag;
> +	struct page_frag  *tag_pfrag = pfrag;
> +	int i;
> +
> +	/* fill prepand */
> +	frag = &record->frags[0];
> +	tls_fill_prepend(ctx,
> +			 skb_frag_address(frag),
> +			 record->len - ctx->prepend_size,
> +			 record_type);
> +
> +	if (unlikely(!skb_page_frag_refill(ctx->tag_size, pfrag, GFP_KERNEL))) {
> +		/* HW doesn't care about the data in the tag
> +		 * so in case pfrag has no room
> +		 * for a tag and we can't allocate a new pfrag
> +		 * just use the page in the first frag
> +		 * rather then write a complicated fall back code.
> +		 */
> +		tag_pfrag = &fallback_frag;
> +		tag_pfrag->page = skb_frag_page(frag);
> +		tag_pfrag->offset = 0;
> +	}
> +
> +	tls_append_frag(record, tag_pfrag, ctx->tag_size);
> +	record->end_seq = tp->write_seq + record->len;
> +	spin_lock_irq(&offload_ctx->lock);
> +	list_add_tail(&record->list, &offload_ctx->records_list);
> +	spin_unlock_irq(&offload_ctx->lock);
> +	offload_ctx->open_record = NULL;
> +	set_bit(TLS_PENDING_CLOSED_RECORD, &ctx->flags);
> +	tls_advance_record_sn(sk, ctx);
> +
> +	for (i = 0; i < record->num_frags; i++) {
> +		frag = &record->frags[i];
> +		sg_unmark_end(&offload_ctx->sg_tx_data[i]);
> +		sg_set_page(&offload_ctx->sg_tx_data[i], skb_frag_page(frag),
> +			    frag->size, frag->page_offset);
> +		sk_mem_charge(sk, frag->size);
> +		get_page(skb_frag_page(frag));
> +	}
> +	sg_mark_end(&offload_ctx->sg_tx_data[record->num_frags - 1]);
> +
> +	/* all ready, send */
> +	return tls_push_sg(sk, ctx, offload_ctx->sg_tx_data, 0, flags);
> +}
> +
> +static inline int tls_create_new_record(struct tls_offload_context *offload_ctx,
> +					struct page_frag *pfrag,
> +					size_t prepend_size)
> +{
> +	skb_frag_t *frag;
> +	struct tls_record_info *record;
> +
> +	record = kmalloc(sizeof(*record), GFP_KERNEL);
> +	if (!record)
> +		return -ENOMEM;
> +
> +	frag = &record->frags[0];
> +	__skb_frag_set_page(frag, pfrag->page);
> +	frag->page_offset = pfrag->offset;
> +	skb_frag_size_set(frag, prepend_size);
> +
> +	get_page(pfrag->page);
> +	pfrag->offset += prepend_size;
> +
> +	record->num_frags = 1;
> +	record->len = prepend_size;
> +	offload_ctx->open_record = record;
> +	return 0;
> +}
> +
> +static inline int tls_do_allocation(struct sock *sk,
> +				    struct tls_offload_context *offload_ctx,
> +				    struct page_frag *pfrag,
> +				    size_t prepend_size)
> +{
> +	int ret;
> +
> +	if (!offload_ctx->open_record) {
> +		if (unlikely(!skb_page_frag_refill(prepend_size, pfrag,
> +						   sk->sk_allocation))) {
> +			sk->sk_prot->enter_memory_pressure(sk);
> +			sk_stream_moderate_sndbuf(sk);
> +			return -ENOMEM;
> +		}
> +
> +		ret = tls_create_new_record(offload_ctx, pfrag, prepend_size);
> +		if (ret)
> +			return ret;
> +
> +		if (pfrag->size > pfrag->offset)
> +			return 0;
> +	}
> +
> +	if (!sk_page_frag_refill(sk, pfrag))
> +		return -ENOMEM;
> +
> +	return 0;
> +}
> +
> +static int tls_push_data(struct sock *sk,
> +			 struct iov_iter *msg_iter,
> +			 size_t size, int flags,
> +			 unsigned char record_type)
> +{
> +	struct tls_context *tls_ctx = tls_get_ctx(sk);
> +	struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
> +	struct tls_record_info *record = ctx->open_record;
> +	struct page_frag *pfrag;
> +	int copy, rc = 0;
> +	size_t orig_size = size;
> +	u32 max_open_record_len;
> +	long timeo;
> +	int more = flags & (MSG_SENDPAGE_NOTLAST | MSG_MORE);
> +	int tls_push_record_flags = flags | MSG_SENDPAGE_NOTLAST;
> +	bool done = false;
> +
> +	if (flags &
> +	    ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST))
> +		return -ENOTSUPP;
> +
> +	if (sk->sk_err)
> +		return -sk->sk_err;
> +
> +	timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
> +	rc = tls_complete_pending_work(sk, tls_ctx, flags, &timeo);
> +	if (rc < 0)
> +		return rc;
> +
> +	pfrag = sk_page_frag(sk);
> +
> +	/* KTLS_TLS_HEADER_SIZE is not counted as part of the TLS record, and
> +	 * we need to leave room for an authentication tag.
> +	 */
> +	max_open_record_len = TLS_MAX_PAYLOAD_SIZE +
> +			      tls_ctx->prepend_size;
> +	do {
> +		if (tls_do_allocation(sk, ctx, pfrag,
> +				      tls_ctx->prepend_size)) {
> +			rc = sk_stream_wait_memory(sk, &timeo);
> +			if (!rc)
> +				continue;
> +
> +			record = ctx->open_record;
> +			if (!record)
> +				break;
> +handle_error:
> +			if (record_type != TLS_RECORD_TYPE_DATA) {
> +				/* avoid sending partial
> +				 * record with type !=
> +				 * application_data
> +				 */
> +				size = orig_size;
> +				destroy_record(record);
> +				ctx->open_record = NULL;
> +			} else if (record->len > tls_ctx->prepend_size) {
> +				goto last_record;
> +			}
> +
> +			break;
> +		}
> +
> +		record = ctx->open_record;
> +		copy = min_t(size_t, size, (pfrag->size - pfrag->offset));
> +		copy = min_t(size_t, copy, (max_open_record_len - record->len));
> +
> +		if (copy_from_iter_nocache(page_address(pfrag->page) +
> +					       pfrag->offset,
> +					   copy, msg_iter) != copy) {
> +			rc = -EFAULT;
> +			goto handle_error;
> +		}
> +		tls_append_frag(record, pfrag, copy);
> +
> +		size -= copy;
> +		if (!size) {
> +last_record:
> +			tls_push_record_flags = flags;
> +			if (more) {
> +				tls_ctx->pending_open_record_frags =
> +						record->num_frags;
> +				break;
> +			}
> +
> +			done = true;
> +		}
> +
> +		if ((done) || record->len >= max_open_record_len ||
> +		    (record->num_frags >= MAX_SKB_FRAGS - 1)) {
> +			rc = tls_push_record(sk,
> +					     tls_ctx,
> +					     ctx,
> +					     record,
> +					     pfrag,
> +					     tls_push_record_flags,
> +					     record_type);
> +			if (rc < 0)
> +				break;
> +		}
> +	} while (!done);
> +
> +	if (orig_size - size > 0)
> +		rc = orig_size - size;
> +
> +	return rc;
> +}
> +
> +int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
> +{
> +	unsigned char record_type = TLS_RECORD_TYPE_DATA;
> +	int rc = 0;
> +
> +	lock_sock(sk);
> +
> +	if (unlikely(msg->msg_controllen)) {
> +		rc = tls_proccess_cmsg(sk, msg, &record_type);
> +		if (rc)
> +			goto out;
> +	}
> +
> +	rc = tls_push_data(sk, &msg->msg_iter, size,
> +			   msg->msg_flags, record_type);
> +
> +out:
> +	release_sock(sk);
> +	return rc;
> +}
> +
> +int tls_device_sendpage(struct sock *sk, struct page *page,
> +			int offset, size_t size, int flags)
> +{
> +	struct iov_iter	msg_iter;
> +	struct kvec iov;
> +	char *kaddr = kmap(page);
> +	int rc = 0;
> +
> +	if (flags & MSG_SENDPAGE_NOTLAST)
> +		flags |= MSG_MORE;
> +
> +	lock_sock(sk);
> +
> +	if (flags & MSG_OOB) {
> +		rc = -ENOTSUPP;
> +		goto out;
> +	}
> +
> +	iov.iov_base = kaddr + offset;
> +	iov.iov_len = size;
> +	iov_iter_kvec(&msg_iter, WRITE | ITER_KVEC, &iov, 1, size);
> +	rc = tls_push_data(sk, &msg_iter, size,
> +			   flags, TLS_RECORD_TYPE_DATA);
> +	kunmap(page);
> +
> +out:
> +	release_sock(sk);
> +	return rc;
> +}
> +
> +struct tls_record_info *tls_get_record(struct tls_offload_context *context,
> +				       u32 seq, u64 *p_record_sn)
> +{
> +	struct tls_record_info *info;
> +	u64 record_sn = context->hint_record_sn;
> +
> +	info = context->retransmit_hint;
> +	if (!info ||
> +	    before(seq, info->end_seq - info->len)) {
> +		/* if retransmit_hint is irrelevant start
> +		 * from the begging of the list
> +		 */
> +		info = list_first_entry(&context->records_list,
> +					struct tls_record_info, list);
> +		record_sn = context->unacked_record_sn;
> +	}
> +
> +	list_for_each_entry_from(info, &context->records_list, list) {
> +		if (before(seq, info->end_seq)) {
> +			if (!context->retransmit_hint ||
> +			    after(info->end_seq,
> +				  context->retransmit_hint->end_seq)) {
> +				context->hint_record_sn = record_sn;
> +				context->retransmit_hint = info;
> +			}
> +			*p_record_sn = record_sn;
> +			return info;
> +		}
> +		record_sn++;
> +	}
> +
> +	return NULL;
> +}
> +EXPORT_SYMBOL(tls_get_record);
> +
> +static int tls_device_push_pending_record(struct sock *sk, int flags)
> +{
> +	struct iov_iter	msg_iter;
> +
> +	iov_iter_kvec(&msg_iter, WRITE | ITER_KVEC, NULL, 0, 0);
> +	return tls_push_data(sk, &msg_iter, 0, flags, TLS_RECORD_TYPE_DATA);
> +}
> +
> +int tls_set_device_offload(struct sock *sk, struct tls_context *ctx)
> +{
> +	struct tls_crypto_info *crypto_info;
> +	struct tls_offload_context *offload_ctx;
> +	struct tls_record_info *start_marker_record;
> +	u16 nonece_size, tag_size, iv_size, rec_seq_size;
> +	char *iv, *rec_seq;
> +	int rc;
> +	struct net_device *netdev;
> +	struct sk_buff *skb;
> +
> +	if (!ctx) {
> +		rc = -EINVAL;
> +		goto out;
> +	}
> +
> +	if (ctx->priv_ctx) {
> +		rc = -EEXIST;
> +		goto out;
> +	}
> +
> +	/* We support starting offload on multiple sockets
> +	 * concurrently, So we only need a read lock here.
> +	 */
> +	percpu_down_read(&device_offload_lock);
> +	netdev = get_netdev_for_sock(sk);
> +	if (!netdev) {
> +		pr_err("%s: netdev not found\n", __func__);

_ratelimit?

> +		rc = -EINVAL;
> +		goto release_lock;
> +	}
> +

  Marcelo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.
  2017-12-18 11:10 ` [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure Ilya Lesokhin
  2017-12-18 19:53   ` Marcelo Ricardo Leitner
@ 2017-12-19  7:00   ` kbuild test robot
  2017-12-19  7:01   ` kbuild test robot
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 35+ messages in thread
From: kbuild test robot @ 2017-12-19  7:00 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: kbuild-all, netdev, davem, davejwatson, tom, hannes, borisp,
	aviadye, liranl, Ilya Lesokhin

[-- Attachment #1: Type: text/plain, Size: 5992 bytes --]

Hi Ilya,

I love your patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Ilya-Lesokhin/tls-Add-generic-NIC-offload-infrastructure/20171219-140819
config: xtensa-allmodconfig (attached as .config)
compiler: xtensa-linux-gcc (GCC) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=xtensa 

All warnings (new ones prefixed by >>):

   net//tls/tls_device_fallback.c: In function 'tls_sw_fallback':
>> net//tls/tls_device_fallback.c:360:1: warning: the frame size of 1040 bytes is larger than 1024 bytes [-Wframe-larger-than=]
    }
    ^

vim +360 net//tls/tls_device_fallback.c

   214	
   215	/* This function may be called after the user socket is already
   216	 * closed so make sure we don't use anything freed during
   217	 * tls_sk_proto_close here
   218	 */
   219	struct sk_buff *tls_sw_fallback(struct sock *sk, struct sk_buff *skb)
   220	{
   221		int tcp_header_size = tcp_hdrlen(skb);
   222		int tcp_payload_offset = skb_transport_offset(skb) + tcp_header_size;
   223		int payload_len = skb->len - tcp_payload_offset;
   224		struct tls_context *tls_ctx = tls_get_ctx(sk);
   225		struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
   226		int remaining, buf_len, resync_sgs, rc, i = 0;
   227		void *buf, *dummy_buf, *iv, *aad;
   228		struct scatterlist sg_in[2 * (MAX_SKB_FRAGS + 1)];
   229		struct scatterlist sg_out[3];
   230		u32 tcp_seq = ntohl(tcp_hdr(skb)->seq);
   231		struct aead_request *aead_req;
   232		struct sk_buff *nskb = NULL;
   233		struct tls_record_info *record;
   234		unsigned long flags;
   235		s32 sync_size;
   236		u64 rcd_sn;
   237	
   238		if (!payload_len)
   239			return skb;
   240	
   241		sg_init_table(sg_in, ARRAY_SIZE(sg_in));
   242		sg_init_table(sg_out, ARRAY_SIZE(sg_out));
   243	
   244		spin_lock_irqsave(&ctx->lock, flags);
   245		record = tls_get_record(ctx, tcp_seq, &rcd_sn);
   246		if (!record) {
   247			spin_unlock_irqrestore(&ctx->lock, flags);
   248			WARN(1, "Record not found for seq %u\n", tcp_seq);
   249			goto free_orig;
   250		}
   251	
   252		sync_size = tcp_seq - tls_record_start_seq(record);
   253		if (sync_size < 0) {
   254			int is_start_marker = tls_record_is_start_marker(record);
   255	
   256			spin_unlock_irqrestore(&ctx->lock, flags);
   257			if (!is_start_marker)
   258			/* This should only occur if the relevant record was
   259			 * already acked. In that case it should be ok
   260			 * to drop the packet and avoid retransmission.
   261			 *
   262			 * There is a corner case where the packet contains
   263			 * both an acked and a non-acked record.
   264			 * We currently don't handle that case and rely
   265			 * on TCP to retranmit a packet that doesn't contain
   266			 * already acked payload.
   267			 */
   268				goto free_orig;
   269	
   270			if (payload_len > -sync_size) {
   271				WARN(1, "Fallback of partially offloaded packets is not supported\n");
   272				goto free_orig;
   273			} else {
   274				return skb;
   275			}
   276		}
   277	
   278		remaining = sync_size;
   279		while (remaining > 0) {
   280			skb_frag_t *frag = &record->frags[i];
   281	
   282			__skb_frag_ref(frag);
   283			sg_set_page(sg_in + i, skb_frag_page(frag),
   284				    skb_frag_size(frag), frag->page_offset);
   285	
   286			remaining -= skb_frag_size(frag);
   287	
   288			if (remaining < 0)
   289				sg_in[i].length += remaining;
   290	
   291			i++;
   292		}
   293		spin_unlock_irqrestore(&ctx->lock, flags);
   294		resync_sgs = i;
   295	
   296		aead_req = tls_alloc_aead_request(ctx->aead_send, GFP_ATOMIC);
   297		if (!aead_req)
   298			goto put_sg;
   299	
   300		buf_len = TLS_CIPHER_AES_GCM_128_SALT_SIZE +
   301			  TLS_CIPHER_AES_GCM_128_IV_SIZE +
   302			  TLS_AAD_SPACE_SIZE +
   303			  sync_size +
   304			  tls_ctx->tag_size;
   305		buf = kmalloc(buf_len, GFP_ATOMIC);
   306		if (!buf)
   307			goto free_req;
   308	
   309		nskb = alloc_skb(skb_headroom(skb) + skb->len, GFP_ATOMIC);
   310		if (!nskb)
   311			goto free_req;
   312	
   313		skb_reserve(nskb, skb_headroom(skb));
   314	
   315		iv = buf;
   316	
   317		memcpy(iv, tls_ctx->crypto_send_aes_gcm_128.salt,
   318		       TLS_CIPHER_AES_GCM_128_SALT_SIZE);
   319		aad = buf + TLS_CIPHER_AES_GCM_128_SALT_SIZE +
   320		      TLS_CIPHER_AES_GCM_128_IV_SIZE;
   321		dummy_buf = aad + TLS_AAD_SPACE_SIZE;
   322	
   323		sg_set_buf(&sg_out[0], dummy_buf, sync_size);
   324		sg_set_buf(&sg_out[1], nskb->data + tcp_payload_offset,
   325			   payload_len);
   326		/* Add room for authentication tag produced by crypto */
   327		dummy_buf += sync_size;
   328		sg_set_buf(&sg_out[2], dummy_buf, tls_ctx->tag_size);
   329		rc = skb_to_sgvec(skb, &sg_in[i], tcp_payload_offset,
   330				  payload_len);
   331		if (rc < 0)
   332			goto free_nskb;
   333	
   334		rc = tls_enc_records(aead_req, ctx->aead_send, sg_in, sg_out, aad, iv,
   335				     rcd_sn, sync_size + payload_len);
   336		if (rc < 0)
   337			goto free_nskb;
   338	
   339		complete_skb(nskb, skb, tcp_payload_offset);
   340	
   341		/* validate_xmit_skb_list assumes that if the skb wasn't segmented
   342		 * nskb->prev will point to the skb itself
   343		 */
   344		nskb->prev = nskb;
   345	free_buf:
   346		kfree(buf);
   347	free_req:
   348		kfree(aead_req);
   349	put_sg:
   350		for (i = 0; i < resync_sgs; i++)
   351			put_page(sg_page(&sg_in[i]));
   352	free_orig:
   353		kfree_skb(skb);
   354		return nskb;
   355	
   356	free_nskb:
   357		kfree_skb(nskb);
   358		nskb = NULL;
   359		goto free_buf;
 > 360	}
   361	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 52638 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.
  2017-12-18 11:10 ` [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure Ilya Lesokhin
  2017-12-18 19:53   ` Marcelo Ricardo Leitner
  2017-12-19  7:00   ` kbuild test robot
@ 2017-12-19  7:01   ` kbuild test robot
  2017-12-19  8:17   ` [RFC PATCH] tls: tls_sw_fallback() can be static kbuild test robot
  2017-12-19  8:17   ` [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure kbuild test robot
  4 siblings, 0 replies; 35+ messages in thread
From: kbuild test robot @ 2017-12-19  7:01 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: kbuild-all, netdev, davem, davejwatson, tom, hannes, borisp,
	aviadye, liranl, Ilya Lesokhin

[-- Attachment #1: Type: text/plain, Size: 2352 bytes --]

Hi Ilya,

I love your patch! Yet something to improve:

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Ilya-Lesokhin/tls-Add-generic-NIC-offload-infrastructure/20171219-140819
config: tile-allmodconfig (attached as .config)
compiler: tilegx-linux-gcc (GCC) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=tile 

All errors (new ones prefixed by >>):

   net/tls/tls_device_fallback.c: In function 'update_chksum':
>> net/tls/tls_device_fallback.c:190:16: error: implicit declaration of function 'csum_ipv6_magic'; did you mean 'csum_tcpudp_magic'? [-Werror=implicit-function-declaration]
      th->check = ~csum_ipv6_magic(&ipv6h->saddr, &ipv6h->daddr,
                   ^~~~~~~~~~~~~~~
                   csum_tcpudp_magic
   cc1: some warnings being treated as errors

vim +190 net/tls/tls_device_fallback.c

   166	
   167	static inline void update_chksum(struct sk_buff *skb, int headln)
   168	{
   169		/* Can't use icsk->icsk_af_ops->send_check here because the ip addresses
   170		 * might have been changed by NAT.
   171		 */
   172	
   173		const struct ipv6hdr *ipv6h;
   174		const struct iphdr *iph;
   175		struct tcphdr *th = tcp_hdr(skb);
   176		int datalen = skb->len - headln;
   177	
   178		/* We only changed the payload so if we are using partial we don't
   179		 * need to update anything.
   180		 */
   181		if (likely(skb->ip_summed == CHECKSUM_PARTIAL))
   182			return;
   183	
   184		skb->ip_summed = CHECKSUM_PARTIAL;
   185		skb->csum_start = skb_transport_header(skb) - skb->head;
   186		skb->csum_offset = offsetof(struct tcphdr, check);
   187	
   188		if (skb->sk->sk_family == AF_INET6) {
   189			ipv6h = ipv6_hdr(skb);
 > 190			th->check = ~csum_ipv6_magic(&ipv6h->saddr, &ipv6h->daddr,
   191						     datalen, IPPROTO_TCP, 0);
   192		} else {
   193			iph = ip_hdr(skb);
   194			th->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, datalen,
   195						       IPPROTO_TCP, 0);
   196		}
   197	}
   198	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 51390 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.
  2017-12-18 19:53   ` Marcelo Ricardo Leitner
@ 2017-12-19  7:31     ` Ilya Lesokhin
  2017-12-19 15:11       ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 35+ messages in thread
From: Ilya Lesokhin @ 2017-12-19  7:31 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: netdev, davem, davejwatson, tom, hannes, Boris Pismenny,
	Aviad Yehezkel, Liran Liss

On Mon, Monday, December 18, 2017 9:54 PM, Marcelo Ricardo Leitner wrote:

> On Mon, Dec 18, 2017 at 01:10:33PM +0200, Ilya Lesokhin wrote:
> > This patch adds a generic infrastructure to offload TLS crypto to a
> > network devices. It enables the kernel TLS socket to skip encryption
> > and authentication operations on the transmit side of the data path.
> > Leaving those computationally expensive operations to the NIC.
> 
> I have a hard time understanding why this was named 'tls_device' if no
> net_device's are registered.
> 
I'm not quite sure what you mean by "no net_device's are registered"
Presumably you mean there is no device that implements the 
NETIF_F_HW_TLS_TX capability yet.
I'll just say that the IPSEC device offload infrastructure was also submitted
https://github.com/torvalds/linux/commit/d77e38e612a017480157fe6d2c1422f42cb5b7e3
before the first implementation
https://github.com/torvalds/linux/commit/bebb23e6cb02d2fc752905e39d09ff6152852c6c

And we did provide a link to an implementation 
https://github.com/Mellanox/tls-offload/tree/tls_device_v3
for people who want to take a look.
Unfortunately it is not ready for upstream submission yet


> > +	percpu_down_read(&device_offload_lock);
> > +	netdev = get_netdev_for_sock(sk);
> > +	if (!netdev) {
> > +		pr_err("%s: netdev not found\n", __func__);
> 
> _ratelimit?
> 

Thanks, we will fix it in the future.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v3 net-next 3/6] net: Add SW fallback infrastructure for offloaded sockets
  2017-12-18 19:18   ` Marcelo Ricardo Leitner
@ 2017-12-19  7:51     ` Ilya Lesokhin
  2017-12-19 15:05       ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 35+ messages in thread
From: Ilya Lesokhin @ 2017-12-19  7:51 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: netdev, davem, davejwatson, tom, hannes, Boris Pismenny,
	Aviad Yehezkel, Liran Liss, Steffen Klassert

On Monday, December 18, 2017 9:18 PM, Marcelo Ricardo Leitner wrote:

> > +
> > +	if (sk && sk_fullsock(sk) && sk->sk_offload_check)
> 
> Isn't this going to hurt the fast path, checking for sk fields here?
> 

We do add code to the fast path but it seems unavoidable if you want to have SW fallback.
The XFRM device offload also does that
http://elixir.free-electrons.com/linux/v4.14.7/source/net/core/dev.c#L3058

The check can be optimized but I didn't want to do that before I saw that it's an issue.
I'm also not sure what the correct solution is.
I don't like the fact that each "stateful protocol" we offload requires its own check.
We need to think if we can find a generic way of doing it.

Perhaps we can hold the expected netdev somewhere in the SKB, and only if the packet
doesn't go out through the expected netdev fall back to a slow path that does a check for each protocol.
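
Something along these lines maybe (completely untested; skb->offload_dev is a
made-up field that doesn't exist today and would have to be set when the skb
leaves an offloaded socket):

	static inline struct sk_buff *skb_offload_check(struct sk_buff *skb,
							struct net_device *dev)
	{
		struct sock *sk = skb->sk;

		/* fast path: the skb goes out through the netdev that
		 * holds its offload state, nothing to do
		 */
		if (likely(!skb->offload_dev || skb->offload_dev == dev))
			return skb;

		/* slow path: let each offloaded protocol decide how to
		 * fall back to SW
		 */
		if (sk && sk_fullsock(sk) && sk->sk_offload_check)
			skb = sk->sk_offload_check(sk, dev, skb);

		return skb;
	}

That way the common case stays a single pointer compare per skb.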

Thanks,
Ilya

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFC PATCH] tls: tls_sw_fallback() can be static
  2017-12-18 11:10 ` [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure Ilya Lesokhin
                     ` (2 preceding siblings ...)
  2017-12-19  7:01   ` kbuild test robot
@ 2017-12-19  8:17   ` kbuild test robot
  2017-12-19  8:17   ` [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure kbuild test robot
  4 siblings, 0 replies; 35+ messages in thread
From: kbuild test robot @ 2017-12-19  8:17 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: kbuild-all, netdev, davem, davejwatson, tom, hannes, borisp,
	aviadye, liranl, Ilya Lesokhin


Fixes: 716b5be4af2a ("tls: Add generic NIC offload infrastructure.")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---
 tls_device_fallback.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tls/tls_device_fallback.c b/net/tls/tls_device_fallback.c
index 95d04d6..d2403d6e 100644
--- a/net/tls/tls_device_fallback.c
+++ b/net/tls/tls_device_fallback.c
@@ -216,7 +216,7 @@ static void complete_skb(struct sk_buff *nskb, struct sk_buff *skb, int headln)
  * closed so make sure we don't use anything freed during
  * tls_sk_proto_close here
  */
-struct sk_buff *tls_sw_fallback(struct sock *sk, struct sk_buff *skb)
+static struct sk_buff *tls_sw_fallback(struct sock *sk, struct sk_buff *skb)
 {
 	int tcp_header_size = tcp_hdrlen(skb);
 	int tcp_payload_offset = skb_transport_offset(skb) + tcp_header_size;

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.
  2017-12-18 11:10 ` [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure Ilya Lesokhin
                     ` (3 preceding siblings ...)
  2017-12-19  8:17   ` [RFC PATCH] tls: tls_sw_fallback() can be static kbuild test robot
@ 2017-12-19  8:17   ` kbuild test robot
  4 siblings, 0 replies; 35+ messages in thread
From: kbuild test robot @ 2017-12-19  8:17 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: kbuild-all, netdev, davem, davejwatson, tom, hannes, borisp,
	aviadye, liranl, Ilya Lesokhin

Hi Ilya,

I love your patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Ilya-Lesokhin/tls-Add-generic-NIC-offload-infrastructure/20171219-140819
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)


Please review and possibly fold the followup patch.

vim +649 net/tls/tls_device.c

   547	
   548	int tls_set_device_offload(struct sock *sk, struct tls_context *ctx)
   549	{
   550		struct tls_crypto_info *crypto_info;
   551		struct tls_offload_context *offload_ctx;
   552		struct tls_record_info *start_marker_record;
   553		u16 nonece_size, tag_size, iv_size, rec_seq_size;
   554		char *iv, *rec_seq;
   555		int rc;
   556		struct net_device *netdev;
   557		struct sk_buff *skb;
   558	
   559		if (!ctx) {
   560			rc = -EINVAL;
   561			goto out;
   562		}
   563	
   564		if (ctx->priv_ctx) {
   565			rc = -EEXIST;
   566			goto out;
   567		}
   568	
   569		/* We support starting offload on multiple sockets
   570		 * concurrently, So we only need a read lock here.
   571		 */
   572		percpu_down_read(&device_offload_lock);
   573		netdev = get_netdev_for_sock(sk);
   574		if (!netdev) {
   575			pr_err("%s: netdev not found\n", __func__);
   576			rc = -EINVAL;
   577			goto release_lock;
   578		}
   579	
   580		if (!(netdev->features & NETIF_F_HW_TLS_TX)) {
   581			rc = -ENOTSUPP;
   582			goto release_netdev;
   583		}
   584	
   585		/* Avoid offloading if the device is down
   586		 * We don't want to offload new flows after
   587		 * the NETDEV_DOWN event
   588		 */
   589		if (!(netdev->flags & IFF_UP)) {
   590			rc = -EINVAL;
   591			goto release_lock;
   592		}
   593	
   594		crypto_info = &ctx->crypto_send;
   595		switch (crypto_info->cipher_type) {
   596		case TLS_CIPHER_AES_GCM_128: {
   597			nonece_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
   598			tag_size = TLS_CIPHER_AES_GCM_128_TAG_SIZE;
   599			iv_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
   600			iv = ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->iv;
   601			rec_seq_size = TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE;
   602			rec_seq =
   603			 ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->rec_seq;
   604			break;
   605		}
   606		default:
   607			rc = -EINVAL;
   608			goto release_netdev;
   609		}
   610	
   611		start_marker_record = kmalloc(sizeof(*start_marker_record), GFP_KERNEL);
   612		if (!start_marker_record) {
   613			rc = -ENOMEM;
   614			goto release_netdev;
   615		}
   616	
   617		rc = attach_sock_to_netdev(sk, netdev, ctx);
   618		if (rc)
   619			goto free_marker_record;
   620	
   621		ctx->netdev = netdev;
   622	
   623		ctx->prepend_size = TLS_HEADER_SIZE + nonece_size;
   624		ctx->tag_size = tag_size;
   625		ctx->iv_size = iv_size;
   626		ctx->iv = kmalloc(iv_size + TLS_CIPHER_AES_GCM_128_SALT_SIZE,
   627				  GFP_KERNEL);
   628		if (!ctx->iv) {
   629			rc = -ENOMEM;
   630			goto detach_sock;
   631		}
   632	
   633		memcpy(ctx->iv + TLS_CIPHER_AES_GCM_128_SALT_SIZE, rec_seq, iv_size);
   634	
   635		ctx->rec_seq_size = rec_seq_size;
   636		ctx->rec_seq = kmalloc(rec_seq_size, GFP_KERNEL);
   637		if (!ctx->rec_seq) {
   638			rc = -ENOMEM;
   639			goto err_iv;
   640		}
   641		memcpy(ctx->rec_seq, rec_seq, rec_seq_size);
   642	
   643		offload_ctx = ctx->priv_ctx;
   644		memcpy(&offload_ctx->unacked_record_sn, rec_seq,
   645		       sizeof(offload_ctx->unacked_record_sn));
   646	
   647		/* start at rec_seq -1 to account for the start marker record */
   648		offload_ctx->unacked_record_sn =
 > 649			be64_to_cpu(offload_ctx->unacked_record_sn) - 1;
   650	
   651		rc = tls_sw_fallback_init(sk, offload_ctx, crypto_info);
   652		if (rc)
   653			goto err_iv;
   654	
   655		start_marker_record->end_seq = tcp_sk(sk)->write_seq;
   656		start_marker_record->len = 0;
   657		start_marker_record->num_frags = 0;
   658	
   659		INIT_LIST_HEAD(&offload_ctx->records_list);
   660		list_add_tail(&start_marker_record->list, &offload_ctx->records_list);
   661		spin_lock_init(&offload_ctx->lock);
   662	
   663		inet_csk(sk)->icsk_clean_acked = &tls_icsk_clean_acked;
   664		ctx->push_pending_record = tls_device_push_pending_record;
   665		offload_ctx->sk_destruct = sk->sk_destruct;
   666	
   667		/* TLS offload is greatly simplified if we don't send
   668		 * SKBs where only part of the payload needs to be encrypted.
   669		 * So mark the last skb in the write queue as end of record.
   670		 */
   671		skb = tcp_write_queue_tail(sk);
   672		if (skb)
   673			TCP_SKB_CB(skb)->eor = 1;
   674	
   675		refcount_set(&ctx->refcount, 1);
   676		spin_lock_irq(&tls_device_lock);
   677		list_add_tail(&ctx->list, &tls_device_list);
   678		spin_unlock_irq(&tls_device_lock);
   679	
   680		/* following this assignment tls_is_sk_tx_device_offloaded
   681		 * will return true and the context might be accessed
   682		 * by the netdev's xmit function.
   683		 */
   684		smp_store_release(&sk->sk_destruct,
   685				  &tls_device_sk_destruct);
   686		goto release_lock;
   687	
   688	err_iv:
   689		kfree(ctx->iv);
   690	detach_sock:
   691		netdev->tlsdev_ops->tls_dev_del(netdev, ctx, TLS_OFFLOAD_CTX_DIR_TX);
   692	free_marker_record:
   693		kfree(start_marker_record);
   694	release_netdev:
   695		dev_put(netdev);
   696	release_lock:
   697		percpu_up_read(&device_offload_lock);
   698	out:
   699		return rc;
   700	}
   701	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure
  2017-12-18 17:10 ` [PATCH v3 net-next 0/6] " Jiri Pirko
@ 2017-12-19 10:30   ` Jiri Pirko
  2017-12-20  8:28     ` Boris Pismenny
  0 siblings, 1 reply; 35+ messages in thread
From: Jiri Pirko @ 2017-12-19 10:30 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: netdev, davem, davejwatson, tom, hannes, borisp, aviadye, liranl

Mon, Dec 18, 2017 at 06:10:10PM CET, jiri@resnulli.us wrote:
>Mon, Dec 18, 2017 at 12:10:27PM CET, ilyal@mellanox.com wrote:
>>Changes from v2:
>>- Fix sk use after free and possible netdev use after free
>>- tls device now keeps a refernce on the offloading netdev
>>- tls device registers to the netdev notifer. 
>>  Upon a NETDEV_DOWN event, offload is stopped and
>>  the reference on the netdev is dropped.
>>- SW fallback support for skb->ip_summed != CHECKSUM_PARTIAL 
>>- Merged TLS patches are no longer part of this series.
>>
>>Changes from v1:
>>- Remove the binding of the socket to a specific netdev 
>>  through sk->sk_bound_dev_if.
>>  Add a check in validate_xmit_skb to detect route changes
>>  and call SW fallback code to do the crypto in software.
>>- tls_get_record now returns the tls record sequence number.
>>  This is required to support connections with rcd_sn != iv.
>>- Bug fixes to the TLS code.
>>
>>This patchset adds a generic infrastructure to offload TLS crypto to a
>>network devices.
>>
>>patches 1-2 Export functions that we need
>>patch 3 adds infrastructue for offloaded socket fallback
>>patches 4-5 add new NDOs and capabilities.
>>patch 6 adds the TLS NIC offload infrastructure.
>>
>>Github with mlx5e TLS offload support:
>>https://github.com/Mellanox/tls-offload/tree/tls_device_v3
>
>I don't get it. You are pushing the infra but not the actual driver part
>that consumes the infra? Why?

Okay. Since the driver that uses the API introduced by this patchset
is missing, this patchset should be marked as RFC.

Dave, I see that you were about to apply v2. I'm sure you missed this.
Thanks.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 3/6] net: Add SW fallback infrastructure for offloaded sockets
  2017-12-19  7:51     ` Ilya Lesokhin
@ 2017-12-19 15:05       ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 35+ messages in thread
From: Marcelo Ricardo Leitner @ 2017-12-19 15:05 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: netdev, davem, davejwatson, tom, hannes, Boris Pismenny,
	Aviad Yehezkel, Liran Liss, Steffen Klassert

On Tue, Dec 19, 2017 at 07:51:02AM +0000, Ilya Lesokhin wrote:
> On Monday, December 18, 2017 9:18 PM, Marcelo Ricardo Leitner wrote:
> 
> > > +
> > > +	if (sk && sk_fullsock(sk) && sk->sk_offload_check)
> > 
> > Isn't this going to hurt the fast path, checking for sk fields here?
> > 
> 
> We do add code to the fast path but it seems unavoidable if you want to have SW fallback.
> The XFRM device offload also does that
> http://elixir.free-electrons.com/linux/v4.14.7/source/net/core/dev.c#L3058

Right, although it's a bit different. It's accessing skb->sp and not the
socket, and depending on how the compiler is doing things, the check
http://elixir.free-electrons.com/linux/v4.14.7/source/net/xfrm/xfrm_device.c#L32
will help in some cases.

But more importantly, all the above only exists if CONFIG_XFRM_OFFLOAD
is enabled.
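
To illustrate the point (just a sketch with a made-up helper name, not the
actual xfrm code): when the config option is off, the hook compiles away to
a no-op, so the common TX path pays nothing:

#ifdef CONFIG_XFRM_OFFLOAD
struct sk_buff *validate_xmit_xfrm_fallback(struct sk_buff *skb,
					    struct net_device *dev);
#else
static inline struct sk_buff *
validate_xmit_xfrm_fallback(struct sk_buff *skb, struct net_device *dev)
{
	/* No XFRM offload configured: nothing to check on the fast path. */
	return skb;
}
#endif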

> 
> The check can be optimized but I didn't want to do that before I saw that it's an issue.
> I'm also not sure what the correct solution is.
> I don't like the fact that each "stateful protocol" we offload requires its own check.
> We need to think if we can find a generic way of doing it.
> 
> Perhaps we can hold the expected netdev somewhere in the SKB and, only if we don't
> go out through the expected netdev, fall back to a slow path that does a check for each protocol.

This could be a good switch, yes.
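
Something along these lines, I guess (hypothetical field and helper names,
just to make the idea concrete):

static inline struct sk_buff *skb_offload_dev_check(struct sk_buff *skb,
						    struct net_device *dev)
{
	/* skb->offload_dev would be a new field, set when the offloaded
	 * socket builds the skb.  The common case (no offload, or same
	 * device) costs a single compare.
	 */
	if (likely(!skb->offload_dev || skb->offload_dev == dev))
		return skb;

	/* Mismatch: only now take the per-protocol SW fallback path
	 * (skb_offload_slow_path() is hypothetical as well).
	 */
	return skb_offload_slow_path(skb, dev);
}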

Thanks,
Marcelo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.
  2017-12-19  7:31     ` Ilya Lesokhin
@ 2017-12-19 15:11       ` Marcelo Ricardo Leitner
  2017-12-19 15:38         ` Ilya Lesokhin
  0 siblings, 1 reply; 35+ messages in thread
From: Marcelo Ricardo Leitner @ 2017-12-19 15:11 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: netdev, davem, davejwatson, tom, hannes, Boris Pismenny,
	Aviad Yehezkel, Liran Liss

On Tue, Dec 19, 2017 at 07:31:24AM +0000, Ilya Lesokhin wrote:
> On Mon, Monday, December 18, 2017 9:54 PM, Marcelo Ricardo Leitner wrote:
> 
> > On Mon, Dec 18, 2017 at 01:10:33PM +0200, Ilya Lesokhin wrote:
> > > This patch adds a generic infrastructure to offload TLS crypto to a
> > > network devices. It enables the kernel TLS socket to skip encryption
> > > and authentication operations on the transmit side of the data path.
> > > Leaving those computationally expensive operations to the NIC.
> > 
> > I have a hard time understanding why this was named 'tls_device' if no
> > net_device's are registered.
> > 
> I'm not quite sure what you mean by "no net_device's are registered"
> Presumably you mean there is no device that implements the 
> NETIF_F_HW_TLS_TX capability yet.

Not really. Let me try again. This patchset is using the expression
"tls_device". When I read that, I expect a new interface type, like a
tunnel, that would be created on top of another interface that has the
offloading capability. That's why I'm confused. IMHO "tls_offload" is
a better fit. Makes sense?

> I'll just say that the IPSEC device offload infrastructure was also submitted
> https://github.com/torvalds/linux/commit/d77e38e612a017480157fe6d2c1422f42cb5b7e3
> before the first implementation
> https://github.com/torvalds/linux/commit/bebb23e6cb02d2fc752905e39d09ff6152852c6c
> 
> And we did provide a link to an implementation 
> https://github.com/Mellanox/tls-offload/tree/tls_device_v3
> for people who want to take a look.
> Unfortunately it is not ready for upstream submission yet

Yep, although I still have to get there.

Thanks,
Marcelo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.
  2017-12-19 15:11       ` Marcelo Ricardo Leitner
@ 2017-12-19 15:38         ` Ilya Lesokhin
  2017-12-19 16:18           ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 35+ messages in thread
From: Ilya Lesokhin @ 2017-12-19 15:38 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: netdev, davem, davejwatson, tom, hannes, Boris Pismenny,
	Aviad Yehezkel, Liran Liss

Tuesday, December 19, 2017 5:12 PM, Marcelo Ricardo Leitner wrote:

> > I'm not quite sure what you mean by "no net_device's are registered"
> > Presumably you mean there is no device that implements the
> > NETIF_F_HW_TLS_TX capability yet.
> 
> Not really. Let me try again. This patchset is using the expression "tls_device".
> When I read that, I expect a new interface type, like a tunnel, that would be
> created on top of another interface that has the offloading capability. That's
> why I'm confused. IMHO "tls_offload" is a better fit. Makes sense?
> 

We don't expose a new interface. An existing netdev does the offload.

The xfrm layer also calls its offload layer xfrm_device, and it also doesn't need to
add another interface to offload IPsec to a netdev.

I thought about calling it tls_hw or tls_hw_offload.
The problem is that the important distinction here is that the 
offload is done by a netdev.
tls_sw can also use HW offload if you have a suitable
memory-to-memory crypto engine and crypto_alloc_aead("gcm(aes)", 0, 0)
decides to use it.
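
To make the distinction concrete, here is a minimal sketch of that tls_sw
setup path (simplified, names loose): tls_sw just asks the crypto API for a
gcm(aes) transform, and whether that ends up in software or on a
memory-to-memory engine is entirely the crypto layer's decision:

	struct crypto_aead *aead;

	aead = crypto_alloc_aead("gcm(aes)", 0, 0);
	if (IS_ERR(aead))
		return PTR_ERR(aead);
	/*
	 * If a crypto driver registered a higher-priority gcm(aes)
	 * implementation (e.g. a mem-to-mem offload engine), the crypto
	 * API selects it here transparently; the TLS code never talks
	 * to a netdev on this path.
	 */

tls_device, by contrast, hands the cleartext record to the netdev through
the new NDOs, which is why the netdev is the important part of the name.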

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.
  2017-12-19 15:38         ` Ilya Lesokhin
@ 2017-12-19 16:18           ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 35+ messages in thread
From: Marcelo Ricardo Leitner @ 2017-12-19 16:18 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: netdev, davem, davejwatson, tom, hannes, Boris Pismenny,
	Aviad Yehezkel, Liran Liss

On Tue, Dec 19, 2017 at 03:38:16PM +0000, Ilya Lesokhin wrote:
> Tuesday, December 19, 2017 5:12 PM, Marcelo Ricardo Leitner wrote:
> 
> > > I'm not quite sure what you mean by "no net_device's are registered"
> > > Presumably you mean there is no device that implements the
> > > NETIF_F_HW_TLS_TX capability yet.
> > 
> > Not really. Let me try again. This patchset is using the expression "tls_device".
> > When I read that, I expect a new interface type, like a tunnel, that would be
> > created on top of another interface that has the offloading capability. That's
> > why I'm confused. IMHO "tls_offload" is a better fit. Makes sense?
> > 
> 
> We don't expose a new interface. An existing netdev does the offload.
> 
> The xfrm layer also calls the offload layer xfrm_device and It also doesn't need to
> add another interface to offload ipsec to a netdev.

Hm right, there is xfrm_dev_init() and others, but there is also
XFRM_OFFLOAD as the config define and not XFRM_DEVICE.

> 
> I thought about calling it tls_hw or tls_hw_offload.
> The problem is that the important distinction here is that the 
> offload is done by a netdev.
> tls_sw can also use hw offload if you have the required 
> memory to memory crypto engine and crypto_alloc_aead("gcm(aes)", 0, 0); 
> decides on using it.

Now I can see the confusion in both ways, thanks.
And now I don't have a preference either.

  Marcelo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 3/6] net: Add SW fallback infrastructure for offloaded sockets
  2017-12-18 11:10 ` [PATCH v3 net-next 3/6] net: Add SW fallback infrastructure for offloaded sockets Ilya Lesokhin
  2017-12-18 19:18   ` Marcelo Ricardo Leitner
@ 2017-12-19 19:12   ` Eric Dumazet
  2017-12-19 19:15     ` Ilya Lesokhin
  1 sibling, 1 reply; 35+ messages in thread
From: Eric Dumazet @ 2017-12-19 19:12 UTC (permalink / raw)
  To: Ilya Lesokhin, netdev, davem
  Cc: davejwatson, tom, hannes, borisp, aviadye, liranl

On Mon, 2017-12-18 at 13:10 +0200, Ilya Lesokhin wrote:
> Offloaded sockets rely on the netdev to transform the transmitted
> packets before sending them over the network.
> When a packet from an offloaded socket is looped back or
> rerouted to a different device we need to detect it and
> do the transformation in software
> 
> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
> Signed-off-by: Boris Pismenny <borisp@mellanox.com>
> ---
>  include/net/sock.h | 17 +++++++++++++++++
>  net/core/dev.c     |  4 ++++
>  2 files changed, 21 insertions(+)
> 
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 9a9047268d37..5397307603ec 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -479,6 +479,9 @@ struct sock {
>  	void			(*sk_error_report)(struct sock *sk);
>  	int			(*sk_backlog_rcv)(struct sock *sk,
>  						  struct sk_buff *skb);
> +	struct sk_buff*		(*sk_offload_check)(struct sock *sk,
> +						    struct net_device *dev,
> +						    struct sk_buff *skb);
>  	void                    (*sk_destruct)(struct sock *sk);
>  	struct sock_reuseport __rcu	*sk_reuseport_cb;
>  	struct rcu_head		sk_rcu;
> @@ -2324,6 +2327,20 @@ static inline bool sk_fullsock(const struct sock *sk)
>  	return (1 << sk->sk_state) & ~(TCPF_TIME_WAIT | TCPF_NEW_SYN_RECV);
>  }
>  
> +/* Checks if this SKB belongs to an HW offloaded socket
> + * and whether any SW fallbacks are required based on dev.
> + */
> +static inline struct sk_buff *skb_offload_check(struct sk_buff *skb,
> +						struct net_device *dev)
> +{
> +	struct sock *sk = skb->sk;
> +
> +	if (sk && sk_fullsock(sk) && sk->sk_offload_check)
> +		skb = sk->sk_offload_check(sk, dev, skb);
> +
> +	return skb;
> +}
> +
>  /* This helper checks if a socket is a LISTEN or NEW_SYN_RECV
>   * SYNACK messages can be attached to either ones (depending on SYNCOOKIE)
>   */
> diff --git a/net/core/dev.c b/net/core/dev.c
> index b0eee49a2489..6a78d9046674 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3051,6 +3051,10 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device
>  	if (unlikely(!skb))
>  		goto out_null;
>  
> +	skb = skb_offload_check(skb, dev);
> +	if (!skb)
> +		goto out_null;
> +
>  	if (netif_needs_gso(skb, features)) {
>  		struct sk_buff *segs;
>  

How is it going to work with stacked devices (bonding, team, tunnels) ?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 1/6] tcp: Add clean acked data hook
  2017-12-18 11:10 ` [PATCH v3 net-next 1/6] tcp: Add clean acked data hook Ilya Lesokhin
@ 2017-12-19 19:13   ` Eric Dumazet
  2017-12-19 19:21     ` Ilya Lesokhin
  0 siblings, 1 reply; 35+ messages in thread
From: Eric Dumazet @ 2017-12-19 19:13 UTC (permalink / raw)
  To: Ilya Lesokhin, netdev, davem
  Cc: davejwatson, tom, hannes, borisp, aviadye, liranl

On Mon, 2017-12-18 at 13:10 +0200, Ilya Lesokhin wrote:
> Called when a TCP segment is acknowledged.
> Could be used by application protocols who hold additional
> metadata associated with the stream data
> This is required by TLS device offload to release
> metadata associated with acknowledged TLS records.
> 
> Signed-off-by: Boris Pismenny <borisp@mellanox.com>
> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
> ---
>  include/net/inet_connection_sock.h | 2 ++
>  net/ipv4/tcp_input.c               | 3 +++
>  2 files changed, 5 insertions(+)
> 
> diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
> index 8e1bf9ae4a5e..ec405a667a85 100644
> --- a/include/net/inet_connection_sock.h
> +++ b/include/net/inet_connection_sock.h
> @@ -77,6 +77,7 @@ struct inet_connection_sock_af_ops {
>   * @icsk_af_ops		   Operations which are AF_INET{4,6} specific
>   * @icsk_ulp_ops	   Pluggable ULP control hook
>   * @icsk_ulp_data	   ULP private data
> + * @icsk_clean_acked	   Clean acked data hook
>   * @icsk_listen_portaddr_node	hash to the portaddr listener hashtable
>   * @icsk_ca_state:	   Congestion control state
>   * @icsk_retransmits:	   Number of unrecovered [RTO] timeouts
> @@ -102,6 +103,7 @@ struct inet_connection_sock {
>  	const struct inet_connection_sock_af_ops *icsk_af_ops;
>  	const struct tcp_ulp_ops  *icsk_ulp_ops;
>  	void			  *icsk_ulp_data;
> +	void			  (*icsk_clean_acked)(struct sock *sk);
>  	struct hlist_node         icsk_listen_portaddr_node;
>  	unsigned int		  (*icsk_sync_mss)(struct sock *sk, u32 pmtu);
>  	__u8			  icsk_ca_state:6,
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 4d55c4b338ee..961abc5be84c 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -3592,6 +3592,9 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
>  	if (!prior_packets)
>  		goto no_queue;
>  
> +	if (icsk->icsk_clean_acked)
> +		icsk->icsk_clean_acked(sk);
> +
>  	/* See if we can take anything off of the retransmit queue. */
>  	flag |= tcp_clean_rtx_queue(sk, prior_fack, prior_snd_una, &sack_state);
>  

1) tcp_ack() is already very expensive.

2) Since you do not pass any state here, this looks very suspicious to
me.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v3 net-next 3/6] net: Add SW fallback infrastructure for offloaded sockets
  2017-12-19 19:12   ` Eric Dumazet
@ 2017-12-19 19:15     ` Ilya Lesokhin
  0 siblings, 0 replies; 35+ messages in thread
From: Ilya Lesokhin @ 2017-12-19 19:15 UTC (permalink / raw)
  To: Eric Dumazet, netdev, davem
  Cc: davejwatson, tom, hannes, Boris Pismenny, Aviad Yehezkel, Liran Liss

> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index b0eee49a2489..6a78d9046674 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -3051,6 +3051,10 @@ static struct sk_buff *validate_xmit_skb(struct
> sk_buff *skb, struct net_device
> >  	if (unlikely(!skb))
> >  		goto out_null;
> >
> > +	skb = skb_offload_check(skb, dev);
> > +	if (!skb)
> > +		goto out_null;
> > +
> >  	if (netif_needs_gso(skb, features)) {
> >  		struct sk_buff *segs;
> >
> 
> How is it going to work with stacked devices (bonding, team, tunnels) ?
> 

Currently we don't support TLS offload on stacked devices.
Note, however, that in that case the TLS layer will refuse to use device offload,
and we will not go through this software fallback path.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v3 net-next 1/6] tcp: Add clean acked data hook
  2017-12-19 19:13   ` Eric Dumazet
@ 2017-12-19 19:21     ` Ilya Lesokhin
  2017-12-19 19:28       ` Eric Dumazet
  0 siblings, 1 reply; 35+ messages in thread
From: Ilya Lesokhin @ 2017-12-19 19:21 UTC (permalink / raw)
  To: Eric Dumazet, netdev, davem
  Cc: davejwatson, tom, hannes, Boris Pismenny, Aviad Yehezkel, Liran Liss

> 1) tcp_ack() is already very expensive.
> 
I'm not sure what we should do with that comment. We need
some trigger to free TLS records. tcp_ack seemed like a reasonable
trigger.

> 2) Since you do not pass any state here, this looks very suspicious to
> me.
> 
The state we need is the acknowledged sequence number, and it is located in the socket.
https://github.com/Mellanox/tls-offload/blob/tls_device_v3/net/tls/tls_device.c#L157
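
For reference, a condensed sketch of what that hook does (simplified from the
tree linked above; field and helper names are approximate, locking and corner
cases omitted):

static void tls_icsk_clean_acked(struct sock *sk)
{
	struct tls_context *tls_ctx = tls_get_ctx(sk);
	struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
	u32 acked = tcp_sk(sk)->snd_una;	/* the only state we need */
	struct tls_record_info *info, *temp;

	list_for_each_entry_safe(info, temp, &ctx->records_list, list) {
		if (before(acked, info->end_seq))
			break;			/* record not fully ACKed yet */
		list_del(&info->list);
		destroy_record(info);		/* drop refs on the record's pages */
	}
}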


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 1/6] tcp: Add clean acked data hook
  2017-12-19 19:21     ` Ilya Lesokhin
@ 2017-12-19 19:28       ` Eric Dumazet
  2017-12-19 19:43         ` Ilya Lesokhin
  0 siblings, 1 reply; 35+ messages in thread
From: Eric Dumazet @ 2017-12-19 19:28 UTC (permalink / raw)
  To: Ilya Lesokhin, netdev, davem
  Cc: davejwatson, tom, hannes, Boris Pismenny, Aviad Yehezkel, Liran Liss

On Tue, 2017-12-19 at 19:21 +0000, Ilya Lesokhin wrote:
> > 1) tcp_ack() is already very expensive.
> > 
> 
> I'm not sure what we should do with that comment. We need
> some trigger to free TLS records. tcp_ack seemed like a reasonable
> trigger.

TLS records should be attached to skbs ?

It seems more reasonable to free TLS when skb are freed, and not in
general tcp_ack() path.

> 
> > 2) Since you do not pass any state here, this looks very suspicious to
> > me.
> > 
> 
> The state we need is the acknowledged sequence number, and it is located in the socket.
> https://github.com/Mellanox/tls-offload/blob/tls_device_v3/net/tls/tls_device.c#L157

So it looks like the TCP stack is bleeding all over the place?

So in the future, a change in the TCP stack will have to make sure we do
not break net/tls/... compilation.

Not pretty.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v3 net-next 1/6] tcp: Add clean acked data hook
  2017-12-19 19:28       ` Eric Dumazet
@ 2017-12-19 19:43         ` Ilya Lesokhin
  0 siblings, 0 replies; 35+ messages in thread
From: Ilya Lesokhin @ 2017-12-19 19:43 UTC (permalink / raw)
  To: Eric Dumazet, netdev, davem
  Cc: davejwatson, tom, hannes, Boris Pismenny, Aviad Yehezkel, Liran Liss

> 
> TLS records should be attached to skbs ?
> 
> It seems more reasonable to free TLS when skb are freed, and not in
> general tcp_ack() path.

We've considered it, but then we would have to touch all the places the TCP stack splits or merges SKBs. Seems more intrusive.
> 
> >
> > > 2) Since you do not pass any state here, this looks very suspicious to
> > > me.
> > >
> >
> > The state we need is the acknowledged sequence number, and it is located in the socket.
> > https://github.com/Mellanox/tls-offload/blob/tls_device_v3/net/tls/tls_device.c#L157
> So it looks like TCP stack is bleeding all over the places ?
> 
> So in the future, a change in TCP stack will have to make sure we do
> not break net/tls/... compilation.
> 
> Not pretty.

I guess we could pass the ack sequence number in that callback.
Would that address the issue?
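
Concretely, that alternative would look something like this (a sketch, not
the posted patch):

	/* include/net/inet_connection_sock.h */
	void	(*icsk_clean_acked)(struct sock *sk, u32 acked_seq);

	/* net/ipv4/tcp_input.c, in tcp_ack() */
	if (icsk->icsk_clean_acked)
		icsk->icsk_clean_acked(sk, tp->snd_una);

so the TLS side only consumes the sequence number it is handed instead of
reaching back into TCP state.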

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure
  2017-12-19 10:30   ` Jiri Pirko
@ 2017-12-20  8:28     ` Boris Pismenny
  2017-12-20 10:08       ` Jiri Pirko
                         ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Boris Pismenny @ 2017-12-20  8:28 UTC (permalink / raw)
  To: Jiri Pirko, Ilya Lesokhin
  Cc: netdev, davem, davejwatson, tom, hannes, Aviad Yehezkel, Liran Liss


> Tue, Dec 19, 2017 at 01:10:10AM CET, jiri@resnulli.us wrote:
> 
> Mon, Dec 18, 2017 at 06:10:10PM CET, jiri@resnulli.us wrote:
> >Mon, Dec 18, 2017 at 12:10:27PM CET, ilyal@mellanox.com wrote:
> >>Changes from v2:
> >>- Fix sk use after free and possible netdev use after free
> >>- tls device now keeps a refernce on the offloading netdev
> >>- tls device registers to the netdev notifer.
> >>  Upon a NETDEV_DOWN event, offload is stopped and
> >>  the reference on the netdev is dropped.
> >>- SW fallback support for skb->ip_summed != CHECKSUM_PARTIAL
> >>- Merged TLS patches are no longer part of this series.
> >>
> >>Changes from v1:
> >>- Remove the binding of the socket to a specific netdev
> >>  through sk->sk_bound_dev_if.
> >>  Add a check in validate_xmit_skb to detect route changes
> >>  and call SW fallback code to do the crypto in software.
> >>- tls_get_record now returns the tls record sequence number.
> >>  This is required to support connections with rcd_sn != iv.
> >>- Bug fixes to the TLS code.
> >>
> >>This patchset adds a generic infrastructure to offload TLS crypto to a
> >>network devices.
> >>
> >>patches 1-2 Export functions that we need patch 3 adds infrastructue
> >>for offloaded socket fallback patches 4-5 add new NDOs and
> >>capabilities.
> >>patch 6 adds the TLS NIC offload infrastructure.
> >>
> >>Github with mlx5e TLS offload support:
> >>https://github.com/Mellanox/tls-offload/tree/tls_device_v3
> >
> >I don't get it. You are pushing infra but not the actual driver part
> >who is consuming the infra? Why?
> 
> Okay. Since the driver that uses the API introduced by this patchset is
> missing, this patchset should be marked as RFC.
> 
> Dave, I see that you were about to apply v2. I'm sure you missed this.
> Thanks.

Isn't this a chicken-and-egg problem, where something must come first,
driver or infra? Unless we combine the infra patches with the mlx5 driver
code and submit both in a single pull request.
Here, we assumed that the infra goes first, and we will submit the
driver soon after. We could submit the driver first instead.

Dave, would you prefer to get the driver patches that use this infra before
the infra?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure
  2017-12-20  8:28     ` Boris Pismenny
@ 2017-12-20 10:08       ` Jiri Pirko
  2017-12-20 10:15       ` Or Gerlitz
  2017-12-20 16:12       ` David Miller
  2 siblings, 0 replies; 35+ messages in thread
From: Jiri Pirko @ 2017-12-20 10:08 UTC (permalink / raw)
  To: Boris Pismenny
  Cc: Ilya Lesokhin, netdev, davem, davejwatson, tom, hannes,
	Aviad Yehezkel, Liran Liss

Wed, Dec 20, 2017 at 09:28:03AM CET, borisp@mellanox.com wrote:
>
>> Tue, Dec 19, 2017 at 01:10:10AM CET, jiri@resnulli.us wrote:
>> 
>> Mon, Dec 18, 2017 at 06:10:10PM CET, jiri@resnulli.us wrote:
>> >Mon, Dec 18, 2017 at 12:10:27PM CET, ilyal@mellanox.com wrote:
>> >>Changes from v2:
>> >>- Fix sk use after free and possible netdev use after free
>> >>- tls device now keeps a refernce on the offloading netdev
>> >>- tls device registers to the netdev notifer.
>> >>  Upon a NETDEV_DOWN event, offload is stopped and
>> >>  the reference on the netdev is dropped.
>> >>- SW fallback support for skb->ip_summed != CHECKSUM_PARTIAL
>> >>- Merged TLS patches are no longer part of this series.
>> >>
>> >>Changes from v1:
>> >>- Remove the binding of the socket to a specific netdev
>> >>  through sk->sk_bound_dev_if.
>> >>  Add a check in validate_xmit_skb to detect route changes
>> >>  and call SW fallback code to do the crypto in software.
>> >>- tls_get_record now returns the tls record sequence number.
>> >>  This is required to support connections with rcd_sn != iv.
>> >>- Bug fixes to the TLS code.
>> >>
>> >>This patchset adds a generic infrastructure to offload TLS crypto to a
>> >>network devices.
>> >>
>> >>patches 1-2 Export functions that we need patch 3 adds infrastructue
>> >>for offloaded socket fallback patches 4-5 add new NDOs and
>> >>capabilities.
>> >>patch 6 adds the TLS NIC offload infrastructure.
>> >>
>> >>Github with mlx5e TLS offload support:
>> >>https://github.com/Mellanox/tls-offload/tree/tls_device_v3
>> >
>> >I don't get it. You are pushing infra but not the actual driver part
>> >who is consuming the infra? Why?
>> 
>> Okay. Since the driver that uses the API introduced by this patchset is
>> missing, this patchset should be marked as RFC.
>> 
>> Dave, I see that you were about to apply v2. I'm sure you missed this.
>> Thanks.
>
>Isn't this a chicken-and-egg problem, where something must come first,
>driver or infra? Unless we combine the infra patches with the mlx5 driver
>code and submit both in a single pull request.

Yes, you should submit that in a single patchset. That is the usual way.
Thanks.


>Here, we assumed that the infra goes first, and we will submit the
>driver soon after. We could submit the driver first instead.

No. You cannot do that like this.


>
>Dave, would you prefer to get the driver patches that use this infra before
>the infra?
>
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure
  2017-12-20  8:28     ` Boris Pismenny
  2017-12-20 10:08       ` Jiri Pirko
@ 2017-12-20 10:15       ` Or Gerlitz
  2017-12-20 10:31         ` Or Gerlitz
  2017-12-20 16:12       ` David Miller
  2 siblings, 1 reply; 35+ messages in thread
From: Or Gerlitz @ 2017-12-20 10:15 UTC (permalink / raw)
  To: Boris Pismenny
  Cc: Jiri Pirko, Ilya Lesokhin, netdev, davem, davejwatson, tom,
	hannes, Aviad Yehezkel, Liran Liss

On Wed, Dec 20, 2017 at 10:28 AM, Boris Pismenny <borisp@mellanox.com> wrote:

> Isn't this a chicken and egg problem, where something must come first,
> driver or infra. Unless we combine the infra patches with mlx5 driver
> code and submit both in a single pull request.

Why chicken and egg? You do the infra changes and apply them to the driver.

> Here, we assumed that the infra goes first, and we will submit the
> driver soon after. We could submit the driver first instead.

No, Boris,

It is a fundamental requirement to show a use-case along with the infrastructure.

Or.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure
  2017-12-20 10:15       ` Or Gerlitz
@ 2017-12-20 10:31         ` Or Gerlitz
  0 siblings, 0 replies; 35+ messages in thread
From: Or Gerlitz @ 2017-12-20 10:31 UTC (permalink / raw)
  To: Boris Pismenny
  Cc: Jiri Pirko, Ilya Lesokhin, netdev, davem, davejwatson, tom,
	hannes, Aviad Yehezkel, Liran Liss

> It is a fundamental requirement to show a use-case along with the infrastructure.

To make it clear, sometimes you don't yet have the full HW/FW/FPGA solution
when you come to upstream the infra code. In that case you do the driver code
and submit it based on an HW emulator run; this is common in the netdev
community and has been done by a few vendors, among them us.

Or.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure
  2017-12-20  8:28     ` Boris Pismenny
  2017-12-20 10:08       ` Jiri Pirko
  2017-12-20 10:15       ` Or Gerlitz
@ 2017-12-20 16:12       ` David Miller
  2017-12-20 16:23         ` Ilya Lesokhin
  2 siblings, 1 reply; 35+ messages in thread
From: David Miller @ 2017-12-20 16:12 UTC (permalink / raw)
  To: borisp; +Cc: jiri, ilyal, netdev, davejwatson, tom, hannes, aviadye, liranl

From: Boris Pismenny <borisp@mellanox.com>
Date: Wed, 20 Dec 2017 08:28:03 +0000

> Dave, would you prefer to get the driver patches that use this infra
> before the infra?

The arguments you present are silly.

In order to analyze any proposed API, the users of it must be presented
for the reviewers to see as well.

Logically, you must have tried to make use of the APIs to see how well
they work and are usable for at least one such user, right?

Therefore, the use case exists, and you must present it alongside the
API proposal.

Whether you provide the API addition patches and the user in the same
patch series, or a separate one, doesn't really matter.  What is
important is that this is accessible to the reviewer at the same
time.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure
  2017-12-20 16:12       ` David Miller
@ 2017-12-20 16:23         ` Ilya Lesokhin
  2017-12-20 16:36           ` David Miller
  0 siblings, 1 reply; 35+ messages in thread
From: Ilya Lesokhin @ 2017-12-20 16:23 UTC (permalink / raw)
  To: David Miller, Boris Pismenny
  Cc: jiri, netdev, davejwatson, tom, hannes, Aviad Yehezkel, Liran Liss

> 
> > Dave, would you prefer to get the driver patches that use this infra
> > before the infra?
> 
> The arguments you present are silly.
> 
> In order to analyze any proposed API, the users of it must be presented for the
> reviewers to see as well.
> 
> Logically, you must have tried to make use of the APIs to see how well they
> work and are usable for at least one such user, right?
Right, we agree.
> 
> Therefore, the use case exists, and you must present it alongside the API
> proposal.
> 
> Whether you provide the API addition patches and the user in the same patch
> series, or a separate one, doesn't really matter.  What is important is that this
> is accessible to the reviewer at the same time.

Note that we did provide a user in an accessible place.
https://github.com/Mellanox/tls-offload/tree/tls_device_v3
The link was at the bottom of the cover letter.

We just feel that the code there is not yet ready for upstream submission, and it might have
conflicts with other stuff submitted by Mellanox.

Would it be better if we submitted the mlx5e TLS support as an RFC alongside the TLS
infrastructure patches?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure
  2017-12-20 16:23         ` Ilya Lesokhin
@ 2017-12-20 16:36           ` David Miller
  2017-12-20 19:12             ` Jiri Pirko
  0 siblings, 1 reply; 35+ messages in thread
From: David Miller @ 2017-12-20 16:36 UTC (permalink / raw)
  To: ilyal; +Cc: borisp, jiri, netdev, davejwatson, tom, hannes, aviadye, liranl

From: Ilya Lesokhin <ilyal@mellanox.com>
Date: Wed, 20 Dec 2017 16:23:03 +0000

>> Whether you provide the API addition patches and the user in the same patch
>> series, or a separate one, doesn't really matter.  What is important is that this
>> is accessible to the reviewer at the same time.
> 
> Note that we did provide a user in an accessible place.

That is not accessible to people reading netdev; it needs to be posted
on the netdev list.

It is never appropriate to require a reviewer to look at some external
site to review a patch series posted here.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure
  2017-12-20 16:36           ` David Miller
@ 2017-12-20 19:12             ` Jiri Pirko
  0 siblings, 0 replies; 35+ messages in thread
From: Jiri Pirko @ 2017-12-20 19:12 UTC (permalink / raw)
  To: David Miller
  Cc: ilyal, borisp, netdev, davejwatson, tom, hannes, aviadye, liranl

Wed, Dec 20, 2017 at 05:36:14PM CET, davem@davemloft.net wrote:
>From: Ilya Lesokhin <ilyal@mellanox.com>
>Date: Wed, 20 Dec 2017 16:23:03 +0000
>
>>> Whether you provide the API addition patches and the user in the same patch
>>> series, or a separate one, doesn't really matter.  What is important is that this
>>> is accessible to the reviewer at the same time.
>> 
>> Note that we did provide a user in an accessible place.
>
>That is not accessible to people reading netdev; it needs to be posted
>on the netdev list.
>
>It is never appropriate to require a reviewer to look at some external
>site to review a patch series posted here.

Yeah, that is why I think it is really best to have the infra and the
driver that uses it in a single patchset. Easier to understand what is
going on :)

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2017-12-20 19:12 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-18 11:10 [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure Ilya Lesokhin
2017-12-18 11:10 ` [PATCH v3 net-next 1/6] tcp: Add clean acked data hook Ilya Lesokhin
2017-12-19 19:13   ` Eric Dumazet
2017-12-19 19:21     ` Ilya Lesokhin
2017-12-19 19:28       ` Eric Dumazet
2017-12-19 19:43         ` Ilya Lesokhin
2017-12-18 11:10 ` [PATCH v3 net-next 2/6] net: Rename and export copy_skb_header Ilya Lesokhin
2017-12-18 11:10 ` [PATCH v3 net-next 3/6] net: Add SW fallback infrastructure for offloaded sockets Ilya Lesokhin
2017-12-18 19:18   ` Marcelo Ricardo Leitner
2017-12-19  7:51     ` Ilya Lesokhin
2017-12-19 15:05       ` Marcelo Ricardo Leitner
2017-12-19 19:12   ` Eric Dumazet
2017-12-19 19:15     ` Ilya Lesokhin
2017-12-18 11:10 ` [PATCH v3 net-next 4/6] net: Add TLS offload netdev ops Ilya Lesokhin
2017-12-18 11:10 ` [PATCH v3 net-next 5/6] net: Add TLS TX offload features Ilya Lesokhin
2017-12-18 11:10 ` [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure Ilya Lesokhin
2017-12-18 19:53   ` Marcelo Ricardo Leitner
2017-12-19  7:31     ` Ilya Lesokhin
2017-12-19 15:11       ` Marcelo Ricardo Leitner
2017-12-19 15:38         ` Ilya Lesokhin
2017-12-19 16:18           ` Marcelo Ricardo Leitner
2017-12-19  7:00   ` kbuild test robot
2017-12-19  7:01   ` kbuild test robot
2017-12-19  8:17   ` [RFC PATCH] tls: tls_sw_fallback() can be static kbuild test robot
2017-12-19  8:17   ` [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure kbuild test robot
2017-12-18 17:10 ` [PATCH v3 net-next 0/6] " Jiri Pirko
2017-12-19 10:30   ` Jiri Pirko
2017-12-20  8:28     ` Boris Pismenny
2017-12-20 10:08       ` Jiri Pirko
2017-12-20 10:15       ` Or Gerlitz
2017-12-20 10:31         ` Or Gerlitz
2017-12-20 16:12       ` David Miller
2017-12-20 16:23         ` Ilya Lesokhin
2017-12-20 16:36           ` David Miller
2017-12-20 19:12             ` Jiri Pirko
