* [PATCH net-next 00/12] tls: add support for kernel-driven resync and nfp RX offload
From: Jakub Kicinski @ 2019-06-11  4:39 UTC
  To: davem
  Cc: netdev, oss-drivers, alexei.starovoitov, davejwatson, borisp,
	Jakub Kicinski

Hi!

This series adds TLS RX offload for the NFP and completes the offload
by providing resync strategies.  When the TLS data stream loses
segments or experiences reordering, the NIC can no longer perform
inline offload.  Resyncs provide information about the placement of
records in the stream so that offload can resume.

Existing TLS resync mechanisms are not a great fit for the NFP.
In particular, TX resync is hard to implement for packet-centric
NICs.  This patchset adds the ability to perform TX resync in a way
similar to how the initial sync is done - by calling down to the
driver when a new record is created after the driver has indicated
that sync was lost.

Similarly, on the RX side we try to wait for a gap in the stream
and send record information for the next record.  This works very
well for RPC workloads, which are the primary focus at this time.
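
As context for the driver-facing API this series converges on (see
patch 10 for the generalized callback), a resync handler skeleton
would look roughly as follows - the "my_" names are placeholders for
illustration, not part of the series:

    static void my_tls_resync(struct net_device *netdev, struct sock *sk,
                              u32 seq, u8 *rcd_sn,
                              enum tls_offload_ctx_dir direction)
    {
            /* RX: tell the device where the next record starts;
             * TX: reinstall the (seq, rcd_sn) record boundary.
             */
    }

    static const struct tlsdev_ops my_tls_ops = {
            /* .tls_dev_add / .tls_dev_del elided */
            .tls_dev_resync = my_tls_resync,
    };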

Dirk van der Merwe (2):
  nfp: tls: set skb decrypted flag
  nfp: tls: implement RX TLS resync

Jakub Kicinski (10):
  net/tls: simplify seq calculation in handle_device_resync()
  net/tls: pass record number as a byte array
  net/tls: rename handle_device_resync()
  net/tls: add kernel-driven TLS RX resync
  nfp: rename nfp_ccm_mbox_alloc()
  nfp: add async version of mailbox communication
  nfp: tls: enable TLS RX offload
  net/tls: generalize the resync callback
  net/tls: add kernel-driven resync mechanism for TX
  nfp: tls: make use of kernel-driven TX resync

 Documentation/networking/tls-offload.rst      |  54 +++++-
 .../mellanox/mlx5/core/en_accel/tls.c         |  10 +-
 drivers/net/ethernet/netronome/nfp/ccm.h      |  10 +-
 drivers/net/ethernet/netronome/nfp/ccm_mbox.c | 179 ++++++++++++++++--
 .../ethernet/netronome/nfp/crypto/crypto.h    |   6 +-
 .../net/ethernet/netronome/nfp/crypto/tls.c   |  73 ++++++-
 drivers/net/ethernet/netronome/nfp/nfp_net.h  |  20 +-
 .../ethernet/netronome/nfp/nfp_net_common.c   |  57 +++++-
 .../ethernet/netronome/nfp/nfp_net_ethtool.c  |  18 +-
 include/net/tls.h                             |  63 +++++-
 net/tls/tls_device.c                          | 140 ++++++++++++--
 net/tls/tls_sw.c                              |   9 +-
 12 files changed, 566 insertions(+), 73 deletions(-)

-- 
2.21.0



* [PATCH net-next 01/12] net/tls: simplify seq calculation in handle_device_resync()
From: Jakub Kicinski @ 2019-06-11  4:39 UTC
  To: davem
  Cc: netdev, oss-drivers, alexei.starovoitov, davejwatson, borisp,
	Jakub Kicinski, Dirk van der Merwe

We subtract "TLS_HEADER_SIZE - 1" from req_seq, then if they
match we add the same constant to seq.  Just add it to seq,
and we don't have to touch req_seq.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 net/tls/tls_device.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 43f2deb57078..59f0c8dacbcc 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -576,14 +576,13 @@ void handle_device_resync(struct sock *sk, u32 seq, u64 rcd_sn)
 
 	rx_ctx = tls_offload_ctx_rx(tls_ctx);
 	resync_req = atomic64_read(&rx_ctx->resync_req);
-	req_seq = (resync_req >> 32) - ((u32)TLS_HEADER_SIZE - 1);
+	req_seq = resync_req >> 32;
+	seq += TLS_HEADER_SIZE - 1;
 	is_req_pending = resync_req;
 
 	if (unlikely(is_req_pending) && req_seq == seq &&
-	    atomic64_try_cmpxchg(&rx_ctx->resync_req, &resync_req, 0)) {
-		seq += TLS_HEADER_SIZE - 1;
+	    atomic64_try_cmpxchg(&rx_ctx->resync_req, &resync_req, 0))
 		tls_device_resync_rx(tls_ctx, sk, seq, rcd_sn);
-	}
 }
 
 static int tls_device_reencrypt(struct sock *sk, struct sk_buff *skb)
-- 
2.21.0



* [PATCH net-next 02/12] net/tls: pass record number as a byte array
From: Jakub Kicinski @ 2019-06-11  4:40 UTC
  To: davem
  Cc: netdev, oss-drivers, alexei.starovoitov, davejwatson, borisp,
	Jakub Kicinski, Dirk van der Merwe

The TLS offload code casts the record number to a u64.  The buffer
should be aligned to 8 bytes, but it's actually a __be64, and the
rest of the TLS code treats it as a big-endian integer.  Make the
offload callbacks take a byte array; drivers can still choose to
do the ugly cast if they want to.

Prepare for copying the record number onto the stack by defining
a constant for the maximum size of the byte array.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 .../net/ethernet/mellanox/mlx5/core/en_accel/tls.c   |  3 ++-
 include/net/tls.h                                    |  5 +++--
 net/tls/tls_device.c                                 | 12 +++++++++---
 net/tls/tls_sw.c                                     |  8 ++++----
 4 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c
index e88340e196f7..d65150aa8298 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c
@@ -161,11 +161,12 @@ static void mlx5e_tls_del(struct net_device *netdev,
 }
 
 static void mlx5e_tls_resync_rx(struct net_device *netdev, struct sock *sk,
-				u32 seq, u64 rcd_sn)
+				u32 seq, u8 *rcd_sn_data)
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct mlx5e_priv *priv = netdev_priv(netdev);
 	struct mlx5e_tls_offload_context_rx *rx_ctx;
+	u64 rcd_sn = *(u64 *)rcd_sn_data;
 
 	rx_ctx = mlx5e_get_tls_rx_context(tls_ctx);
 
diff --git a/include/net/tls.h b/include/net/tls.h
index 3ecf45adb707..25641e2f5b96 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -62,6 +62,7 @@
 #define TLS_DEVICE_NAME_MAX		32
 
 #define MAX_IV_SIZE			16
+#define TLS_MAX_REC_SEQ_SIZE		8
 
 /* For AES-CCM, the full 16-bytes of IV is made of '4' fields of given sizes.
  *
@@ -299,7 +300,7 @@ struct tlsdev_ops {
 			    struct tls_context *ctx,
 			    enum tls_offload_ctx_dir direction);
 	void (*tls_dev_resync_rx)(struct net_device *netdev,
-				  struct sock *sk, u32 seq, u64 rcd_sn);
+				  struct sock *sk, u32 seq, u8 *rcd_sn);
 };
 
 struct tls_offload_context_rx {
@@ -607,6 +608,6 @@ int tls_sw_fallback_init(struct sock *sk,
 int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx);
 
 void tls_device_offload_cleanup_rx(struct sock *sk);
-void handle_device_resync(struct sock *sk, u32 seq, u64 rcd_sn);
+void handle_device_resync(struct sock *sk, u32 seq);
 
 #endif /* _TLS_OFFLOAD_H */
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 59f0c8dacbcc..16635f0c829c 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -551,7 +551,7 @@ void tls_device_write_space(struct sock *sk, struct tls_context *ctx)
 }
 
 static void tls_device_resync_rx(struct tls_context *tls_ctx,
-				 struct sock *sk, u32 seq, u64 rcd_sn)
+				 struct sock *sk, u32 seq, u8 *rcd_sn)
 {
 	struct net_device *netdev;
 
@@ -563,7 +563,7 @@ static void tls_device_resync_rx(struct tls_context *tls_ctx,
 	clear_bit_unlock(TLS_RX_SYNC_RUNNING, &tls_ctx->flags);
 }
 
-void handle_device_resync(struct sock *sk, u32 seq, u64 rcd_sn)
+void handle_device_resync(struct sock *sk, u32 seq)
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct tls_offload_context_rx *rx_ctx;
@@ -582,7 +582,7 @@ void handle_device_resync(struct sock *sk, u32 seq, u64 rcd_sn)
 
 	if (unlikely(is_req_pending) && req_seq == seq &&
 	    atomic64_try_cmpxchg(&rx_ctx->resync_req, &resync_req, 0))
-		tls_device_resync_rx(tls_ctx, sk, seq, rcd_sn);
+		tls_device_resync_rx(tls_ctx, sk, seq, tls_ctx->rx.rec_seq);
 }
 
 static int tls_device_reencrypt(struct sock *sk, struct sk_buff *skb)
@@ -760,6 +760,12 @@ int tls_set_device_offload(struct sock *sk, struct tls_context *ctx)
 		goto free_offload_ctx;
 	}
 
+	/* Sanity-check the rec_seq_size for stack allocations */
+	if (rec_seq_size > TLS_MAX_REC_SEQ_SIZE) {
+		rc = -EINVAL;
+		goto free_offload_ctx;
+	}
+
 	prot->prepend_size = TLS_HEADER_SIZE + nonce_size;
 	prot->tag_size = tag_size;
 	prot->overhead_size = prot->prepend_size + prot->tag_size;
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index bef71e54fad0..c1d22290f1d0 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2015,8 +2015,7 @@ static int tls_read_size(struct strparser *strp, struct sk_buff *skb)
 		goto read_failure;
 	}
 #ifdef CONFIG_TLS_DEVICE
-	handle_device_resync(strp->sk, TCP_SKB_CB(skb)->seq + rxm->offset,
-			     *(u64*)tls_ctx->rx.rec_seq);
+	handle_device_resync(strp->sk, TCP_SKB_CB(skb)->seq + rxm->offset);
 #endif
 	return data_len + TLS_HEADER_SIZE;
 
@@ -2283,8 +2282,9 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx)
 		goto free_priv;
 	}
 
-	/* Sanity-check the IV size for stack allocations. */
-	if (iv_size > MAX_IV_SIZE || nonce_size > MAX_IV_SIZE) {
+	/* Sanity-check the sizes for stack allocations. */
+	if (iv_size > MAX_IV_SIZE || nonce_size > MAX_IV_SIZE ||
+	    rec_seq_size > TLS_MAX_REC_SEQ_SIZE) {
 		rc = -EINVAL;
 		goto free_priv;
 	}
-- 
2.21.0



* [PATCH net-next 03/12] net/tls: rename handle_device_resync()
From: Jakub Kicinski @ 2019-06-11  4:40 UTC
  To: davem
  Cc: netdev, oss-drivers, alexei.starovoitov, davejwatson, borisp,
	Jakub Kicinski, Dirk van der Merwe

handle_device_resync() doesn't describe the function very well.
The function checks whether a resync should be issued upon parsing
of a new record.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 include/net/tls.h    | 2 +-
 net/tls/tls_device.c | 2 +-
 net/tls/tls_sw.c     | 3 ++-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/net/tls.h b/include/net/tls.h
index 25641e2f5b96..1c512da5e4f4 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -608,6 +608,6 @@ int tls_sw_fallback_init(struct sock *sk,
 int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx);
 
 void tls_device_offload_cleanup_rx(struct sock *sk);
-void handle_device_resync(struct sock *sk, u32 seq);
+void tls_device_rx_resync_new_rec(struct sock *sk, u32 seq);
 
 #endif /* _TLS_OFFLOAD_H */
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 16635f0c829c..0ecfa0ee415d 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -563,7 +563,7 @@ static void tls_device_resync_rx(struct tls_context *tls_ctx,
 	clear_bit_unlock(TLS_RX_SYNC_RUNNING, &tls_ctx->flags);
 }
 
-void handle_device_resync(struct sock *sk, u32 seq)
+void tls_device_rx_resync_new_rec(struct sock *sk, u32 seq)
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct tls_offload_context_rx *rx_ctx;
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index c1d22290f1d0..bc3a1b188d4a 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2015,7 +2015,8 @@ static int tls_read_size(struct strparser *strp, struct sk_buff *skb)
 		goto read_failure;
 	}
 #ifdef CONFIG_TLS_DEVICE
-	handle_device_resync(strp->sk, TCP_SKB_CB(skb)->seq + rxm->offset);
+	tls_device_rx_resync_new_rec(strp->sk,
+				     TCP_SKB_CB(skb)->seq + rxm->offset);
 #endif
 	return data_len + TLS_HEADER_SIZE;
 
-- 
2.21.0



* [PATCH net-next 04/12] net/tls: add kernel-driven TLS RX resync
From: Jakub Kicinski @ 2019-06-11  4:40 UTC
  To: davem
  Cc: netdev, oss-drivers, alexei.starovoitov, davejwatson, borisp,
	Jakub Kicinski, Dirk van der Merwe

A TLS offload device may lose sync with the TCP stream if packets
arrive out of order.  Drivers can currently request a resync at
a specific TCP sequence number.  When a record is found starting
at that sequence number, the kernel will inform the device of the
corresponding record number.

This requires the device to constantly scan the stream for a
known pattern (constant bytes of the header) after sync is lost.

This patch adds an alternative approach which is entirely under
the control of the kernel.  The kernel tracks records it had to
fully decrypt, even though the TLS socket is in TLS_HW mode.  If
multiple records did not have any decrypted parts, that's a pretty
strong indication that the device is out of sync.

We choose the minimum number of fully encrypted records to be 2,
which should hopefully be more than will get retransmitted at
a time.

After the kernel decides the device is out of sync it schedules a
resync request.  If the TCP socket is empty the resync is performed
immediately.  If the socket is not empty, we leave it to the record
parser to resync when the next record arrives.

Before resyncing in the message parser we peek at the TCP socket and
don't attempt the sync if the socket already has some of the
next record queued.

On resync failure (encrypted data continues to flow in) we
retry with exponential backoff, up to once every 128 records
(with 16k records that's at most once every 2M of data).
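
For illustration only, here is a standalone userspace model of that
back-off - it is not part of the patch; the constants mirror
TLS_DEVICE_RESYNC_NH_START_IVAL and TLS_DEVICE_RESYNC_NH_MAX_IVAL
from the diff below:

    #include <stdio.h>

    #define START_IVAL 2    /* TLS_DEVICE_RESYNC_NH_START_IVAL */
    #define MAX_IVAL   128  /* TLS_DEVICE_RESYNC_NH_MAX_IVAL */

    int main(void)
    {
            unsigned int failed = 1, tgt = START_IVAL, rec;

            /* model a run of fully-encrypted records after a reset */
            for (rec = 2; rec <= 1200; rec++) {
                    if (++failed <= tgt)
                            continue;
                    /* doing resync, bump the next target in case it fails */
                    if (tgt < MAX_IVAL)
                            tgt *= 2;
                    else
                            tgt += MAX_IVAL;
                    printf("resync attempt at encrypted record %u\n", rec);
            }
            return 0;
    }

This prints attempts at records 3, 5, 9, 17, 33, 65, 129, 257, ... -
i.e. the 2, 4, 8, ... 128 intervals described above.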

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 Documentation/networking/tls-offload.rst |  19 ++++
 include/net/tls.h                        |  34 +++++++-
 net/tls/tls_device.c                     | 105 ++++++++++++++++++++---
 net/tls/tls_sw.c                         |   2 +-
 4 files changed, 145 insertions(+), 15 deletions(-)

diff --git a/Documentation/networking/tls-offload.rst b/Documentation/networking/tls-offload.rst
index eb7c9b81ccf5..d134d63307e7 100644
--- a/Documentation/networking/tls-offload.rst
+++ b/Documentation/networking/tls-offload.rst
@@ -268,6 +268,9 @@ Device can only detect that segment 4 also contains a TLS header
 if it knows the length of the previous record from segment 2. In this case
 the device will lose synchronization with the stream.
 
+Stream scan resynchronization
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 When the device gets out of sync and the stream reaches TCP sequence
 numbers more than a max size record past the expected TCP sequence number,
 the device starts scanning for a known header pattern. For example
@@ -298,6 +301,22 @@ Special care has to be taken if the confirmation request is passed
 asynchronously to the packet stream and record may get processed
 by the kernel before the confirmation request.
 
+Stack-driven resynchronization
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The driver may also request the stack to perform resynchronization
+whenever it sees the records are no longer getting decrypted.
+If the connection is configured in this mode the stack automatically
+schedules resynchronization after it has received two completely encrypted
+records.
+
+The stack waits for the socket to drain and informs the device about
+the next expected record number and its TCP sequence number. If the
+records continue to be received fully encrypted stack retries the
+synchronization with an exponential back off (first after 2 encrypted
+records, then after 4 records, after 8, after 16... up until every
+128 records).
+
 Error handling
 ==============
 
diff --git a/include/net/tls.h b/include/net/tls.h
index 1c512da5e4f4..28eca6a3b615 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -303,10 +303,33 @@ struct tlsdev_ops {
 				  struct sock *sk, u32 seq, u8 *rcd_sn);
 };
 
+enum tls_offload_sync_type {
+	TLS_OFFLOAD_SYNC_TYPE_DRIVER_REQ = 0,
+	TLS_OFFLOAD_SYNC_TYPE_CORE_NEXT_HINT = 1,
+};
+
+#define TLS_DEVICE_RESYNC_NH_START_IVAL		2
+#define TLS_DEVICE_RESYNC_NH_MAX_IVAL		128
+
 struct tls_offload_context_rx {
 	/* sw must be the first member of tls_offload_context_rx */
 	struct tls_sw_context_rx sw;
-	atomic64_t resync_req;
+	enum tls_offload_sync_type resync_type;
+	/* this member is set regardless of resync_type, to avoid branches */
+	u8 resync_nh_reset:1;
+	/* CORE_NEXT_HINT-only member, but use the hole here */
+	u8 resync_nh_do_now:1;
+	union {
+		/* TLS_OFFLOAD_SYNC_TYPE_DRIVER_REQ */
+		struct {
+			atomic64_t resync_req;
+		};
+		/* TLS_OFFLOAD_SYNC_TYPE_CORE_NEXT_HINT */
+		struct {
+			u32 decrypted_failed;
+			u32 decrypted_tgt;
+		} resync_nh;
+	};
 	u8 driver_state[] __aligned(8);
 	/* The TLS layer reserves room for driver specific state
 	 * Currently the belief is that there is not enough
@@ -587,6 +610,13 @@ static inline void tls_offload_rx_resync_request(struct sock *sk, __be32 seq)
 	atomic64_set(&rx_ctx->resync_req, ((u64)ntohl(seq) << 32) | 1);
 }
 
+static inline void
+tls_offload_rx_resync_set_type(struct sock *sk, enum tls_offload_sync_type type)
+{
+	struct tls_context *tls_ctx = tls_get_ctx(sk);
+
+	tls_offload_ctx_rx(tls_ctx)->resync_type = type;
+}
 
 int tls_proccess_cmsg(struct sock *sk, struct msghdr *msg,
 		      unsigned char *record_type);
@@ -608,6 +638,6 @@ int tls_sw_fallback_init(struct sock *sk,
 int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx);
 
 void tls_device_offload_cleanup_rx(struct sock *sk);
-void tls_device_rx_resync_new_rec(struct sock *sk, u32 seq);
+void tls_device_rx_resync_new_rec(struct sock *sk, u32 rcd_len, u32 seq);
 
 #endif /* _TLS_OFFLOAD_H */
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 0ecfa0ee415d..477c869c69c8 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -563,10 +563,12 @@ static void tls_device_resync_rx(struct tls_context *tls_ctx,
 	clear_bit_unlock(TLS_RX_SYNC_RUNNING, &tls_ctx->flags);
 }
 
-void tls_device_rx_resync_new_rec(struct sock *sk, u32 seq)
+void tls_device_rx_resync_new_rec(struct sock *sk, u32 rcd_len, u32 seq)
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct tls_offload_context_rx *rx_ctx;
+	u8 rcd_sn[TLS_MAX_REC_SEQ_SIZE];
+	struct tls_prot_info *prot;
 	u32 is_req_pending;
 	s64 resync_req;
 	u32 req_seq;
@@ -574,15 +576,84 @@ void tls_device_rx_resync_new_rec(struct sock *sk, u32 seq)
 	if (tls_ctx->rx_conf != TLS_HW)
 		return;
 
+	prot = &tls_ctx->prot_info;
 	rx_ctx = tls_offload_ctx_rx(tls_ctx);
-	resync_req = atomic64_read(&rx_ctx->resync_req);
-	req_seq = resync_req >> 32;
-	seq += TLS_HEADER_SIZE - 1;
-	is_req_pending = resync_req;
-
-	if (unlikely(is_req_pending) && req_seq == seq &&
-	    atomic64_try_cmpxchg(&rx_ctx->resync_req, &resync_req, 0))
-		tls_device_resync_rx(tls_ctx, sk, seq, tls_ctx->rx.rec_seq);
+	memcpy(rcd_sn, tls_ctx->rx.rec_seq, prot->rec_seq_size);
+
+	switch (rx_ctx->resync_type) {
+	case TLS_OFFLOAD_SYNC_TYPE_DRIVER_REQ:
+		resync_req = atomic64_read(&rx_ctx->resync_req);
+		req_seq = resync_req >> 32;
+		seq += TLS_HEADER_SIZE - 1;
+		is_req_pending = resync_req;
+
+		if (likely(!is_req_pending) || req_seq != seq ||
+		    !atomic64_try_cmpxchg(&rx_ctx->resync_req, &resync_req, 0))
+			return;
+		break;
+	case TLS_OFFLOAD_SYNC_TYPE_CORE_NEXT_HINT:
+		if (likely(!rx_ctx->resync_nh_do_now))
+			return;
+
+		/* head of next rec is already in, note that the sock_inq will
+		 * include the currently parsed message when called from parser
+		 */
+		if (tcp_inq(sk) > rcd_len)
+			return;
+
+		rx_ctx->resync_nh_do_now = 0;
+		seq += rcd_len;
+		tls_bigint_increment(rcd_sn, prot->rec_seq_size);
+		break;
+	}
+
+	tls_device_resync_rx(tls_ctx, sk, seq, rcd_sn);
+}
+
+static void tls_device_core_ctrl_rx_resync(struct tls_context *tls_ctx,
+					   struct tls_offload_context_rx *ctx,
+					   struct sock *sk, struct sk_buff *skb)
+{
+	struct strp_msg *rxm;
+
+	/* device will request resyncs by itself based on stream scan */
+	if (ctx->resync_type != TLS_OFFLOAD_SYNC_TYPE_CORE_NEXT_HINT)
+		return;
+	/* already scheduled */
+	if (ctx->resync_nh_do_now)
+		return;
+	/* seen decrypted fragments since last fully-failed record */
+	if (ctx->resync_nh_reset) {
+		ctx->resync_nh_reset = 0;
+		ctx->resync_nh.decrypted_failed = 1;
+		ctx->resync_nh.decrypted_tgt = TLS_DEVICE_RESYNC_NH_START_IVAL;
+		return;
+	}
+
+	if (++ctx->resync_nh.decrypted_failed <= ctx->resync_nh.decrypted_tgt)
+		return;
+
+	/* doing resync, bump the next target in case it fails */
+	if (ctx->resync_nh.decrypted_tgt < TLS_DEVICE_RESYNC_NH_MAX_IVAL)
+		ctx->resync_nh.decrypted_tgt *= 2;
+	else
+		ctx->resync_nh.decrypted_tgt += TLS_DEVICE_RESYNC_NH_MAX_IVAL;
+
+	rxm = strp_msg(skb);
+
+	/* head of next rec is already in, parser will sync for us */
+	if (tcp_inq(sk) > rxm->full_len) {
+		ctx->resync_nh_do_now = 1;
+	} else {
+		struct tls_prot_info *prot = &tls_ctx->prot_info;
+		u8 rcd_sn[TLS_MAX_REC_SEQ_SIZE];
+
+		memcpy(rcd_sn, tls_ctx->rx.rec_seq, prot->rec_seq_size);
+		tls_bigint_increment(rcd_sn, prot->rec_seq_size);
+
+		tls_device_resync_rx(tls_ctx, sk, tcp_sk(sk)->copied_seq,
+				     rcd_sn);
+	}
 }
 
 static int tls_device_reencrypt(struct sock *sk, struct sk_buff *skb)
@@ -686,12 +757,21 @@ int tls_device_decrypted(struct sock *sk, struct sk_buff *skb)
 
 	ctx->sw.decrypted |= is_decrypted;
 
-	/* Return immedeatly if the record is either entirely plaintext or
+	/* Return immediately if the record is either entirely plaintext or
 	 * entirely ciphertext. Otherwise handle reencrypt partially decrypted
 	 * record.
 	 */
-	return (is_encrypted || is_decrypted) ? 0 :
-		tls_device_reencrypt(sk, skb);
+	if (is_decrypted) {
+		ctx->resync_nh_reset = 1;
+		return 0;
+	}
+	if (is_encrypted) {
+		tls_device_core_ctrl_rx_resync(tls_ctx, ctx, sk, skb);
+		return 0;
+	}
+
+	ctx->resync_nh_reset = 1;
+	return tls_device_reencrypt(sk, skb);
 }
 
 static void tls_device_attach(struct tls_context *ctx, struct sock *sk,
@@ -917,6 +997,7 @@ int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx)
 		rc = -ENOMEM;
 		goto release_netdev;
 	}
+	context->resync_nh_reset = 1;
 
 	ctx->priv_ctx_rx = context;
 	rc = tls_set_sw_offload(sk, ctx, 0);
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index bc3a1b188d4a..533eaa4826e5 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2015,7 +2015,7 @@ static int tls_read_size(struct strparser *strp, struct sk_buff *skb)
 		goto read_failure;
 	}
 #ifdef CONFIG_TLS_DEVICE
-	tls_device_rx_resync_new_rec(strp->sk,
+	tls_device_rx_resync_new_rec(strp->sk, data_len + TLS_HEADER_SIZE,
 				     TCP_SKB_CB(skb)->seq + rxm->offset);
 #endif
 	return data_len + TLS_HEADER_SIZE;
-- 
2.21.0



* [PATCH net-next 05/12] nfp: tls: set skb decrypted flag
From: Jakub Kicinski @ 2019-06-11  4:40 UTC
  To: davem
  Cc: netdev, oss-drivers, alexei.starovoitov, davejwatson, borisp,
	Dirk van der Merwe, Jakub Kicinski

From: Dirk van der Merwe <dirk.vandermerwe@netronome.com>

The firmware indicates that a packet has been decrypted by reusing
the currently unused BPF flag.  Transfer this information into the
skb and provide a statistic counting all decrypted segments.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_net.h   |  4 +++-
 .../ethernet/netronome/nfp/nfp_net_common.c    |  9 +++++++++
 .../ethernet/netronome/nfp/nfp_net_ethtool.c   | 18 ++++++++++--------
 3 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
index 661fa5941b91..7bfc819d1e85 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
@@ -240,7 +240,7 @@ struct nfp_net_tx_ring {
 #define PCIE_DESC_RX_I_TCP_CSUM_OK	cpu_to_le16(BIT(11))
 #define PCIE_DESC_RX_I_UDP_CSUM		cpu_to_le16(BIT(10))
 #define PCIE_DESC_RX_I_UDP_CSUM_OK	cpu_to_le16(BIT(9))
-#define PCIE_DESC_RX_BPF		cpu_to_le16(BIT(8))
+#define PCIE_DESC_RX_DECRYPTED		cpu_to_le16(BIT(8))
 #define PCIE_DESC_RX_EOP		cpu_to_le16(BIT(7))
 #define PCIE_DESC_RX_IP4_CSUM		cpu_to_le16(BIT(6))
 #define PCIE_DESC_RX_IP4_CSUM_OK	cpu_to_le16(BIT(5))
@@ -367,6 +367,7 @@ struct nfp_net_rx_ring {
  * @hw_csum_rx_inner_ok: Counter of packets where the inner HW checksum was OK
  * @hw_csum_rx_complete: Counter of packets with CHECKSUM_COMPLETE reported
  * @hw_csum_rx_error:	 Counter of packets with bad checksums
+ * @hw_tls_rx:	    Number of packets with TLS decrypted by hardware
  * @tx_sync:	    Seqlock for atomic updates of TX stats
  * @tx_pkts:	    Number of Transmitted packets
  * @tx_bytes:	    Number of Transmitted bytes
@@ -415,6 +416,7 @@ struct nfp_net_r_vector {
 	u64 hw_csum_rx_ok;
 	u64 hw_csum_rx_inner_ok;
 	u64 hw_csum_rx_complete;
+	u64 hw_tls_rx;
 
 	u64 hw_csum_rx_error;
 	u64 rx_replace_buf_alloc_fail;
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index e221847d9a3e..349678425aed 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1951,6 +1951,15 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget)
 
 		nfp_net_rx_csum(dp, r_vec, rxd, &meta, skb);
 
+#ifdef CONFIG_TLS_DEVICE
+		if (rxd->rxd.flags & PCIE_DESC_RX_DECRYPTED) {
+			skb->decrypted = true;
+			u64_stats_update_begin(&r_vec->rx_sync);
+			r_vec->hw_tls_rx++;
+			u64_stats_update_end(&r_vec->rx_sync);
+		}
+#endif
+
 		if (rxd->rxd.flags & PCIE_DESC_RX_VLAN)
 			__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q),
 					       le16_to_cpu(rxd->rxd.vlan));
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
index 3a8e1af7042d..d9cbe84ac6ad 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
@@ -150,7 +150,7 @@ static const struct nfp_et_stat nfp_mac_et_stats[] = {
 
 #define NN_ET_GLOBAL_STATS_LEN ARRAY_SIZE(nfp_net_et_stats)
 #define NN_ET_SWITCH_STATS_LEN 9
-#define NN_RVEC_GATHER_STATS	12
+#define NN_RVEC_GATHER_STATS	13
 #define NN_RVEC_PER_Q_STATS	3
 #define NN_CTRL_PATH_STATS	1
 
@@ -444,6 +444,7 @@ static u8 *nfp_vnic_get_sw_stats_strings(struct net_device *netdev, u8 *data)
 	data = nfp_pr_et(data, "hw_rx_csum_complete");
 	data = nfp_pr_et(data, "hw_rx_csum_err");
 	data = nfp_pr_et(data, "rx_replace_buf_alloc_fail");
+	data = nfp_pr_et(data, "rx_tls_decrypted");
 	data = nfp_pr_et(data, "hw_tx_csum");
 	data = nfp_pr_et(data, "hw_tx_inner_csum");
 	data = nfp_pr_et(data, "tx_gather");
@@ -475,19 +476,20 @@ static u64 *nfp_vnic_get_sw_stats(struct net_device *netdev, u64 *data)
 			tmp[2] = nn->r_vecs[i].hw_csum_rx_complete;
 			tmp[3] = nn->r_vecs[i].hw_csum_rx_error;
 			tmp[4] = nn->r_vecs[i].rx_replace_buf_alloc_fail;
+			tmp[5] = nn->r_vecs[i].hw_tls_rx;
 		} while (u64_stats_fetch_retry(&nn->r_vecs[i].rx_sync, start));
 
 		do {
 			start = u64_stats_fetch_begin(&nn->r_vecs[i].tx_sync);
 			data[1] = nn->r_vecs[i].tx_pkts;
 			data[2] = nn->r_vecs[i].tx_busy;
-			tmp[5] = nn->r_vecs[i].hw_csum_tx;
-			tmp[6] = nn->r_vecs[i].hw_csum_tx_inner;
-			tmp[7] = nn->r_vecs[i].tx_gather;
-			tmp[8] = nn->r_vecs[i].tx_lso;
-			tmp[9] = nn->r_vecs[i].hw_tls_tx;
-			tmp[10] = nn->r_vecs[i].tls_tx_fallback;
-			tmp[11] = nn->r_vecs[i].tls_tx_no_fallback;
+			tmp[6] = nn->r_vecs[i].hw_csum_tx;
+			tmp[7] = nn->r_vecs[i].hw_csum_tx_inner;
+			tmp[8] = nn->r_vecs[i].tx_gather;
+			tmp[9] = nn->r_vecs[i].tx_lso;
+			tmp[10] = nn->r_vecs[i].hw_tls_tx;
+			tmp[11] = nn->r_vecs[i].tls_tx_fallback;
+			tmp[12] = nn->r_vecs[i].tls_tx_no_fallback;
 		} while (u64_stats_fetch_retry(&nn->r_vecs[i].tx_sync, start));
 
 		data += NN_RVEC_PER_Q_STATS;
-- 
2.21.0



* [PATCH net-next 06/12] nfp: rename nfp_ccm_mbox_alloc()
From: Jakub Kicinski @ 2019-06-11  4:40 UTC
  To: davem
  Cc: netdev, oss-drivers, alexei.starovoitov, davejwatson, borisp,
	Jakub Kicinski, Dirk van der Merwe

We need the name nfp_ccm_mbox_alloc() for allocating the mailbox
communication channel itself.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/ccm.h        | 4 ++--
 drivers/net/ethernet/netronome/nfp/ccm_mbox.c   | 4 ++--
 drivers/net/ethernet/netronome/nfp/crypto/tls.c | 8 ++++----
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/ccm.h b/drivers/net/ethernet/netronome/nfp/ccm.h
index 01efa779ab31..c905898ab26e 100644
--- a/drivers/net/ethernet/netronome/nfp/ccm.h
+++ b/drivers/net/ethernet/netronome/nfp/ccm.h
@@ -112,8 +112,8 @@ nfp_ccm_communicate(struct nfp_ccm *ccm, struct sk_buff *skb,
 
 bool nfp_ccm_mbox_fits(struct nfp_net *nn, unsigned int size);
 struct sk_buff *
-nfp_ccm_mbox_alloc(struct nfp_net *nn, unsigned int req_size,
-		   unsigned int reply_size, gfp_t flags);
+nfp_ccm_mbox_msg_alloc(struct nfp_net *nn, unsigned int req_size,
+		       unsigned int reply_size, gfp_t flags);
 int nfp_ccm_mbox_communicate(struct nfp_net *nn, struct sk_buff *skb,
 			     enum nfp_ccm_type type,
 			     unsigned int reply_size,
diff --git a/drivers/net/ethernet/netronome/nfp/ccm_mbox.c b/drivers/net/ethernet/netronome/nfp/ccm_mbox.c
index e5acd96c3335..53995d53aa3f 100644
--- a/drivers/net/ethernet/netronome/nfp/ccm_mbox.c
+++ b/drivers/net/ethernet/netronome/nfp/ccm_mbox.c
@@ -564,8 +564,8 @@ int nfp_ccm_mbox_communicate(struct nfp_net *nn, struct sk_buff *skb,
 }
 
 struct sk_buff *
-nfp_ccm_mbox_alloc(struct nfp_net *nn, unsigned int req_size,
-		   unsigned int reply_size, gfp_t flags)
+nfp_ccm_mbox_msg_alloc(struct nfp_net *nn, unsigned int req_size,
+		       unsigned int reply_size, gfp_t flags)
 {
 	unsigned int max_size;
 	struct sk_buff *skb;
diff --git a/drivers/net/ethernet/netronome/nfp/crypto/tls.c b/drivers/net/ethernet/netronome/nfp/crypto/tls.c
index c638223e9f60..b7d7317d71d1 100644
--- a/drivers/net/ethernet/netronome/nfp/crypto/tls.c
+++ b/drivers/net/ethernet/netronome/nfp/crypto/tls.c
@@ -94,9 +94,9 @@ nfp_net_tls_conn_remove(struct nfp_net *nn, enum tls_offload_ctx_dir direction)
 static struct sk_buff *
 nfp_net_tls_alloc_simple(struct nfp_net *nn, size_t req_sz, gfp_t flags)
 {
-	return nfp_ccm_mbox_alloc(nn, req_sz,
-				  sizeof(struct nfp_crypto_reply_simple),
-				  flags);
+	return nfp_ccm_mbox_msg_alloc(nn, req_sz,
+				      sizeof(struct nfp_crypto_reply_simple),
+				      flags);
 }
 
 static int
@@ -283,7 +283,7 @@ nfp_net_tls_add(struct net_device *netdev, struct sock *sk,
 	if (err)
 		return err;
 
-	skb = nfp_ccm_mbox_alloc(nn, req_sz, sizeof(*reply), GFP_KERNEL);
+	skb = nfp_ccm_mbox_msg_alloc(nn, req_sz, sizeof(*reply), GFP_KERNEL);
 	if (!skb) {
 		err = -ENOMEM;
 		goto err_conn_remove;
-- 
2.21.0



* [PATCH net-next 07/12] nfp: add async version of mailbox communication
From: Jakub Kicinski @ 2019-06-11  4:40 UTC
  To: davem
  Cc: netdev, oss-drivers, alexei.starovoitov, davejwatson, borisp,
	Jakub Kicinski, Dirk van der Merwe

Some control messages must be sent from atomic context.  The mailbox
takes sleeping locks and uses a waitqueue, so add a "posted" version
of the communication.

Trylock the semaphore and, if that's successful, kick off the device
communication.  The device communication will be completed from
a workqueue, which will also release the semaphore.

If the lock is already taken, queue the message and return.  Schedule
a different work item to take the semaphore and run the communication.
Note that there are currently no atomic users which would actually
need the return value, so all replies to posted messages are just
freed.
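
For illustration only, the trylock-or-queue decision can be modelled
in userspace like this (a pthread mutex stands in for the BAR
semaphore; this sketch is not part of the patch):

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t bar_lock = PTHREAD_MUTEX_INITIALIZER;

    /* "post" a message: run the I/O now if the lock is free,
     * otherwise leave it queued for the worker to pick up
     */
    static void post_msg(int msg)
    {
            if (pthread_mutex_trylock(&bar_lock) == 0) {
                    printf("msg %d: lock free, kicking off device I/O\n", msg);
                    /* wait_work would complete the I/O and unlock */
                    pthread_mutex_unlock(&bar_lock);
            } else {
                    printf("msg %d: lock busy, queued for runq_work\n", msg);
            }
    }

    int main(void)
    {
            post_msg(1);
            pthread_mutex_lock(&bar_lock);  /* simulate a busy mailbox */
            post_msg(2);
            pthread_mutex_unlock(&bar_lock);
            return 0;
    }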

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/ccm.h      |   8 +-
 drivers/net/ethernet/netronome/nfp/ccm_mbox.c | 175 ++++++++++++++++--
 drivers/net/ethernet/netronome/nfp/nfp_net.h  |  14 ++
 .../ethernet/netronome/nfp/nfp_net_common.c   |  40 +++-
 4 files changed, 215 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/ccm.h b/drivers/net/ethernet/netronome/nfp/ccm.h
index c905898ab26e..da1b1e20df51 100644
--- a/drivers/net/ethernet/netronome/nfp/ccm.h
+++ b/drivers/net/ethernet/netronome/nfp/ccm.h
@@ -100,7 +100,7 @@ struct nfp_ccm {
 	u16 tag_alloc_last;
 
 	struct sk_buff_head replies;
-	struct wait_queue_head wq;
+	wait_queue_head_t wq;
 };
 
 int nfp_ccm_init(struct nfp_ccm *ccm, struct nfp_app *app);
@@ -110,6 +110,10 @@ struct sk_buff *
 nfp_ccm_communicate(struct nfp_ccm *ccm, struct sk_buff *skb,
 		    enum nfp_ccm_type type, unsigned int reply_size);
 
+int nfp_ccm_mbox_alloc(struct nfp_net *nn);
+void nfp_ccm_mbox_free(struct nfp_net *nn);
+int nfp_ccm_mbox_init(struct nfp_net *nn);
+void nfp_ccm_mbox_clean(struct nfp_net *nn);
 bool nfp_ccm_mbox_fits(struct nfp_net *nn, unsigned int size);
 struct sk_buff *
 nfp_ccm_mbox_msg_alloc(struct nfp_net *nn, unsigned int req_size,
@@ -118,4 +122,6 @@ int nfp_ccm_mbox_communicate(struct nfp_net *nn, struct sk_buff *skb,
 			     enum nfp_ccm_type type,
 			     unsigned int reply_size,
 			     unsigned int max_reply_size);
+int nfp_ccm_mbox_post(struct nfp_net *nn, struct sk_buff *skb,
+		      enum nfp_ccm_type type, unsigned int max_reply_size);
 #endif
diff --git a/drivers/net/ethernet/netronome/nfp/ccm_mbox.c b/drivers/net/ethernet/netronome/nfp/ccm_mbox.c
index 53995d53aa3f..02fccd90961d 100644
--- a/drivers/net/ethernet/netronome/nfp/ccm_mbox.c
+++ b/drivers/net/ethernet/netronome/nfp/ccm_mbox.c
@@ -41,12 +41,14 @@ enum nfp_net_mbox_cmsg_state {
  * @err:	error encountered during processing if any
  * @max_len:	max(request_len, reply_len)
  * @exp_reply:	expected reply length (0 means don't validate)
+ * @posted:	the message was posted and nobody waits for the reply
  */
 struct nfp_ccm_mbox_cmsg_cb {
 	enum nfp_net_mbox_cmsg_state state;
 	int err;
 	unsigned int max_len;
 	unsigned int exp_reply;
+	bool posted;
 };
 
 static u32 nfp_ccm_mbox_max_msg(struct nfp_net *nn)
@@ -65,6 +67,7 @@ nfp_ccm_mbox_msg_init(struct sk_buff *skb, unsigned int exp_reply, int max_len)
 	cb->err = 0;
 	cb->max_len = max_len;
 	cb->exp_reply = exp_reply;
+	cb->posted = false;
 }
 
 static int nfp_ccm_mbox_maxlen(const struct sk_buff *skb)
@@ -96,6 +99,20 @@ static void nfp_ccm_mbox_set_busy(struct sk_buff *skb)
 	cb->state = NFP_NET_MBOX_CMSG_STATE_BUSY;
 }
 
+static bool nfp_ccm_mbox_is_posted(struct sk_buff *skb)
+{
+	struct nfp_ccm_mbox_cmsg_cb *cb = (void *)skb->cb;
+
+	return cb->posted;
+}
+
+static void nfp_ccm_mbox_mark_posted(struct sk_buff *skb)
+{
+	struct nfp_ccm_mbox_cmsg_cb *cb = (void *)skb->cb;
+
+	cb->posted = true;
+}
+
 static bool nfp_ccm_mbox_is_first(struct nfp_net *nn, struct sk_buff *skb)
 {
 	return skb_queue_is_first(&nn->mbox_cmsg.queue, skb);
@@ -119,6 +136,8 @@ static void nfp_ccm_mbox_mark_next_runner(struct nfp_net *nn)
 
 	cb = (void *)skb->cb;
 	cb->state = NFP_NET_MBOX_CMSG_STATE_NEXT;
+	if (cb->posted)
+		queue_work(nn->mbox_cmsg.workq, &nn->mbox_cmsg.runq_work);
 }
 
 static void
@@ -205,9 +224,7 @@ static void nfp_ccm_mbox_copy_out(struct nfp_net *nn, struct sk_buff *last)
 	while (true) {
 		unsigned int length, offset, type;
 		struct nfp_ccm_hdr hdr;
-		__be32 *skb_data;
 		u32 tlv_hdr;
-		int i, cnt;
 
 		tlv_hdr = readl(data);
 		type = FIELD_GET(NFP_NET_MBOX_TLV_TYPE, tlv_hdr);
@@ -278,20 +295,26 @@ static void nfp_ccm_mbox_copy_out(struct nfp_net *nn, struct sk_buff *last)
 			goto next_tlv;
 		}
 
-		if (length <= skb->len)
-			__skb_trim(skb, length);
-		else
-			skb_put(skb, length - skb->len);
-
-		/* We overcopy here slightly, but that's okay, the skb is large
-		 * enough, and the garbage will be ignored (beyond skb->len).
-		 */
-		skb_data = (__be32 *)skb->data;
-		memcpy(skb_data, &hdr, 4);
-
-		cnt = DIV_ROUND_UP(length, 4);
-		for (i = 1 ; i < cnt; i++)
-			skb_data[i] = cpu_to_be32(readl(data + i * 4));
+		if (!cb->posted) {
+			__be32 *skb_data;
+			int i, cnt;
+
+			if (length <= skb->len)
+				__skb_trim(skb, length);
+			else
+				skb_put(skb, length - skb->len);
+
+			/* We overcopy here slightly, but that's okay,
+			 * the skb is large enough, and the garbage will
+			 * be ignored (beyond skb->len).
+			 */
+			skb_data = (__be32 *)skb->data;
+			memcpy(skb_data, &hdr, 4);
+
+			cnt = DIV_ROUND_UP(length, 4);
+			for (i = 1 ; i < cnt; i++)
+				skb_data[i] = cpu_to_be32(readl(data + i * 4));
+		}
 
 		cb->state = NFP_NET_MBOX_CMSG_STATE_REPLY_FOUND;
 next_tlv:
@@ -314,6 +337,14 @@ static void nfp_ccm_mbox_copy_out(struct nfp_net *nn, struct sk_buff *last)
 			smp_wmb(); /* order the cb->err vs. cb->state */
 		}
 		cb->state = NFP_NET_MBOX_CMSG_STATE_DONE;
+
+		if (cb->posted) {
+			if (cb->err)
+				nn_dp_warn(&nn->dp,
+					   "mailbox posted msg failed type:%u err:%d\n",
+					   nfp_ccm_get_type(skb), cb->err);
+			dev_consume_skb_any(skb);
+		}
 	} while (skb != last);
 
 	nfp_ccm_mbox_mark_next_runner(nn);
@@ -563,6 +594,89 @@ int nfp_ccm_mbox_communicate(struct nfp_net *nn, struct sk_buff *skb,
 	return err;
 }
 
+static void nfp_ccm_mbox_post_runq_work(struct work_struct *work)
+{
+	struct sk_buff *skb;
+	struct nfp_net *nn;
+
+	nn = container_of(work, struct nfp_net, mbox_cmsg.runq_work);
+
+	spin_lock_bh(&nn->mbox_cmsg.queue.lock);
+
+	skb = __skb_peek(&nn->mbox_cmsg.queue);
+	if (WARN_ON(!skb || !nfp_ccm_mbox_is_posted(skb) ||
+		    !nfp_ccm_mbox_should_run(nn, skb))) {
+		spin_unlock_bh(&nn->mbox_cmsg.queue.lock);
+		return;
+	}
+
+	nfp_ccm_mbox_run_queue_unlock(nn);
+}
+
+static void nfp_ccm_mbox_post_wait_work(struct work_struct *work)
+{
+	struct sk_buff *skb;
+	struct nfp_net *nn;
+	int err;
+
+	nn = container_of(work, struct nfp_net, mbox_cmsg.wait_work);
+
+	skb = skb_peek(&nn->mbox_cmsg.queue);
+	if (WARN_ON(!skb || !nfp_ccm_mbox_is_posted(skb)))
+		/* Should never happen so it's unclear what to do here.. */
+		goto exit_unlock_wake;
+
+	err = nfp_net_mbox_reconfig_wait_posted(nn);
+	if (!err)
+		nfp_ccm_mbox_copy_out(nn, skb);
+	else
+		nfp_ccm_mbox_mark_all_err(nn, skb, -EIO);
+exit_unlock_wake:
+	nn_ctrl_bar_unlock(nn);
+	wake_up_all(&nn->mbox_cmsg.wq);
+}
+
+int nfp_ccm_mbox_post(struct nfp_net *nn, struct sk_buff *skb,
+		      enum nfp_ccm_type type, unsigned int max_reply_size)
+{
+	int err;
+
+	err = nfp_ccm_mbox_msg_prepare(nn, skb, type, 0, max_reply_size,
+				       GFP_ATOMIC);
+	if (err)
+		goto err_free_skb;
+
+	nfp_ccm_mbox_mark_posted(skb);
+
+	spin_lock_bh(&nn->mbox_cmsg.queue.lock);
+
+	err = nfp_ccm_mbox_msg_enqueue(nn, skb, type);
+	if (err)
+		goto err_unlock;
+
+	if (nfp_ccm_mbox_is_first(nn, skb)) {
+		if (nn_ctrl_bar_trylock(nn)) {
+			nfp_ccm_mbox_copy_in(nn, skb);
+			nfp_net_mbox_reconfig_post(nn,
+						   NFP_NET_CFG_MBOX_CMD_TLV_CMSG);
+			queue_work(nn->mbox_cmsg.workq,
+				   &nn->mbox_cmsg.wait_work);
+		} else {
+			nfp_ccm_mbox_mark_next_runner(nn);
+		}
+	}
+
+	spin_unlock_bh(&nn->mbox_cmsg.queue.lock);
+
+	return 0;
+
+err_unlock:
+	spin_unlock_bh(&nn->mbox_cmsg.queue.lock);
+err_free_skb:
+	dev_kfree_skb_any(skb);
+	return err;
+}
+
 struct sk_buff *
 nfp_ccm_mbox_msg_alloc(struct nfp_net *nn, unsigned int req_size,
 		       unsigned int reply_size, gfp_t flags)
@@ -589,3 +703,32 @@ bool nfp_ccm_mbox_fits(struct nfp_net *nn, unsigned int size)
 {
 	return nfp_ccm_mbox_max_msg(nn) >= size;
 }
+
+int nfp_ccm_mbox_init(struct nfp_net *nn)
+{
+	return 0;
+}
+
+void nfp_ccm_mbox_clean(struct nfp_net *nn)
+{
+	drain_workqueue(nn->mbox_cmsg.workq);
+}
+
+int nfp_ccm_mbox_alloc(struct nfp_net *nn)
+{
+	skb_queue_head_init(&nn->mbox_cmsg.queue);
+	init_waitqueue_head(&nn->mbox_cmsg.wq);
+	INIT_WORK(&nn->mbox_cmsg.wait_work, nfp_ccm_mbox_post_wait_work);
+	INIT_WORK(&nn->mbox_cmsg.runq_work, nfp_ccm_mbox_post_runq_work);
+
+	nn->mbox_cmsg.workq = alloc_workqueue("nfp-ccm-mbox", WQ_UNBOUND, 0);
+	if (!nn->mbox_cmsg.workq)
+		return -ENOMEM;
+	return 0;
+}
+
+void nfp_ccm_mbox_free(struct nfp_net *nn)
+{
+	destroy_workqueue(nn->mbox_cmsg.workq);
+	WARN_ON(!skb_queue_empty(&nn->mbox_cmsg.queue));
+}
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
index 7bfc819d1e85..46305f181764 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
@@ -19,6 +19,7 @@
 #include <linux/pci.h>
 #include <linux/io-64-nonatomic-hi-lo.h>
 #include <linux/semaphore.h>
+#include <linux/workqueue.h>
 #include <net/xdp.h>
 
 #include "nfp_net_ctrl.h"
@@ -586,6 +587,9 @@ struct nfp_net_dp {
  * @mbox_cmsg:		Common Control Message via vNIC mailbox state
  * @mbox_cmsg.queue:	CCM mbox queue of pending messages
  * @mbox_cmsg.wq:	CCM mbox wait queue of waiting processes
+ * @mbox_cmsg.workq:	CCM mbox work queue for @wait_work and @runq_work
+ * @mbox_cmsg.wait_work:    CCM mbox posted msg reconfig wait work
+ * @mbox_cmsg.runq_work:    CCM mbox posted msg queue runner work
  * @mbox_cmsg.tag:	CCM mbox message tag allocator
  * @debugfs_dir:	Device directory in debugfs
  * @vnic_list:		Entry on device vNIC list
@@ -669,6 +673,9 @@ struct nfp_net {
 	struct {
 		struct sk_buff_head queue;
 		wait_queue_head_t wq;
+		struct workqueue_struct *workq;
+		struct work_struct wait_work;
+		struct work_struct runq_work;
 		u16 tag;
 	} mbox_cmsg;
 
@@ -886,6 +893,11 @@ static inline void nn_ctrl_bar_lock(struct nfp_net *nn)
 	down(&nn->bar_lock);
 }
 
+static inline bool nn_ctrl_bar_trylock(struct nfp_net *nn)
+{
+	return !down_trylock(&nn->bar_lock);
+}
+
 static inline void nn_ctrl_bar_unlock(struct nfp_net *nn)
 {
 	up(&nn->bar_lock);
@@ -927,6 +939,8 @@ void nfp_net_coalesce_write_cfg(struct nfp_net *nn);
 int nfp_net_mbox_lock(struct nfp_net *nn, unsigned int data_size);
 int nfp_net_mbox_reconfig(struct nfp_net *nn, u32 mbox_cmd);
 int nfp_net_mbox_reconfig_and_unlock(struct nfp_net *nn, u32 mbox_cmd);
+void nfp_net_mbox_reconfig_post(struct nfp_net *nn, u32 update);
+int nfp_net_mbox_reconfig_wait_posted(struct nfp_net *nn);
 
 unsigned int
 nfp_net_irqs_alloc(struct pci_dev *pdev, struct msix_entry *irq_entries,
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 349678425aed..c9c43abb2427 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -40,6 +40,7 @@
 #include <net/vxlan.h>
 
 #include "nfpcore/nfp_nsp.h"
+#include "ccm.h"
 #include "nfp_app.h"
 #include "nfp_net_ctrl.h"
 #include "nfp_net.h"
@@ -229,6 +230,7 @@ static void nfp_net_reconfig_sync_enter(struct nfp_net *nn)
 
 	spin_lock_bh(&nn->reconfig_lock);
 
+	WARN_ON(nn->reconfig_sync_present);
 	nn->reconfig_sync_present = true;
 
 	if (nn->reconfig_timer_active) {
@@ -341,6 +343,24 @@ int nfp_net_mbox_reconfig(struct nfp_net *nn, u32 mbox_cmd)
 	return -nn_readl(nn, mbox + NFP_NET_CFG_MBOX_SIMPLE_RET);
 }
 
+void nfp_net_mbox_reconfig_post(struct nfp_net *nn, u32 mbox_cmd)
+{
+	u32 mbox = nn->tlv_caps.mbox_off;
+
+	nn_writeq(nn, mbox + NFP_NET_CFG_MBOX_SIMPLE_CMD, mbox_cmd);
+
+	nfp_net_reconfig_post(nn, NFP_NET_CFG_UPDATE_MBOX);
+}
+
+int nfp_net_mbox_reconfig_wait_posted(struct nfp_net *nn)
+{
+	u32 mbox = nn->tlv_caps.mbox_off;
+
+	nfp_net_reconfig_wait_posted(nn);
+
+	return -nn_readl(nn, mbox + NFP_NET_CFG_MBOX_SIMPLE_RET);
+}
+
 int nfp_net_mbox_reconfig_and_unlock(struct nfp_net *nn, u32 mbox_cmd)
 {
 	int ret;
@@ -3814,14 +3834,15 @@ nfp_net_alloc(struct pci_dev *pdev, void __iomem *ctrl_bar, bool needs_netdev,
 
 	timer_setup(&nn->reconfig_timer, nfp_net_reconfig_timer, 0);
 
-	skb_queue_head_init(&nn->mbox_cmsg.queue);
-	init_waitqueue_head(&nn->mbox_cmsg.wq);
-
 	err = nfp_net_tlv_caps_parse(&nn->pdev->dev, nn->dp.ctrl_bar,
 				     &nn->tlv_caps);
 	if (err)
 		goto err_free_nn;
 
+	err = nfp_ccm_mbox_alloc(nn);
+	if (err)
+		goto err_free_nn;
+
 	return nn;
 
 err_free_nn:
@@ -3839,7 +3860,7 @@ nfp_net_alloc(struct pci_dev *pdev, void __iomem *ctrl_bar, bool needs_netdev,
 void nfp_net_free(struct nfp_net *nn)
 {
 	WARN_ON(timer_pending(&nn->reconfig_timer) || nn->reconfig_posted);
-	WARN_ON(!skb_queue_empty(&nn->mbox_cmsg.queue));
+	nfp_ccm_mbox_free(nn);
 
 	if (nn->dp.netdev)
 		free_netdev(nn->dp.netdev);
@@ -4117,9 +4138,13 @@ int nfp_net_init(struct nfp_net *nn)
 	if (nn->dp.netdev) {
 		nfp_net_netdev_init(nn);
 
-		err = nfp_net_tls_init(nn);
+		err = nfp_ccm_mbox_init(nn);
 		if (err)
 			return err;
+
+		err = nfp_net_tls_init(nn);
+		if (err)
+			goto err_clean_mbox;
 	}
 
 	nfp_net_vecs_init(nn);
@@ -4127,6 +4152,10 @@ int nfp_net_init(struct nfp_net *nn)
 	if (!nn->dp.netdev)
 		return 0;
 	return register_netdev(nn->dp.netdev);
+
+err_clean_mbox:
+	nfp_ccm_mbox_clean(nn);
+	return err;
 }
 
 /**
@@ -4139,5 +4168,6 @@ void nfp_net_clean(struct nfp_net *nn)
 		return;
 
 	unregister_netdev(nn->dp.netdev);
+	nfp_ccm_mbox_clean(nn);
 	nfp_net_reconfig_wait_posted(nn);
 }
-- 
2.21.0



* [PATCH net-next 08/12] nfp: tls: implement RX TLS resync
From: Jakub Kicinski @ 2019-06-11  4:40 UTC
  To: davem
  Cc: netdev, oss-drivers, alexei.starovoitov, davejwatson, borisp,
	Dirk van der Merwe, Jakub Kicinski

From: Dirk van der Merwe <dirk.vandermerwe@netronome.com>

Enable kernel-controlled RX resync and propagate TLS connection
RX resync from kernel TLS to firmware.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 .../net/ethernet/netronome/nfp/crypto/tls.c   | 32 +++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/crypto/tls.c b/drivers/net/ethernet/netronome/nfp/crypto/tls.c
index b7d7317d71d1..eebaf5e1621d 100644
--- a/drivers/net/ethernet/netronome/nfp/crypto/tls.c
+++ b/drivers/net/ethernet/netronome/nfp/crypto/tls.c
@@ -344,6 +344,11 @@ nfp_net_tls_add(struct net_device *netdev, struct sock *sk,
 	ntls->next_seq = start_offload_tcp_sn;
 	dev_consume_skb_any(skb);
 
+	if (direction == TLS_OFFLOAD_CTX_DIR_TX)
+		return 0;
+
+	tls_offload_rx_resync_set_type(sk,
+				       TLS_OFFLOAD_SYNC_TYPE_CORE_NEXT_HINT);
 	return 0;
 
 err_fw_remove:
@@ -368,9 +373,36 @@ nfp_net_tls_del(struct net_device *netdev, struct tls_context *tls_ctx,
 	nfp_net_tls_del_fw(nn, ntls->fw_handle);
 }
 
+static void
+nfp_net_tls_resync_rx(struct net_device *netdev, struct sock *sk, u32 seq,
+		      u8 *rcd_sn)
+{
+	struct nfp_net *nn = netdev_priv(netdev);
+	struct nfp_net_tls_offload_ctx *ntls;
+	struct nfp_crypto_req_update *req;
+	struct sk_buff *skb;
+
+	skb = nfp_net_tls_alloc_simple(nn, sizeof(*req), GFP_ATOMIC);
+	if (!skb)
+		return;
+
+	ntls = tls_driver_ctx(sk, TLS_OFFLOAD_CTX_DIR_RX);
+	req = (void *)skb->data;
+	req->ep_id = 0;
+	req->opcode = NFP_NET_CRYPTO_OP_TLS_1_2_AES_GCM_128_DEC;
+	memset(req->resv, 0, sizeof(req->resv));
+	memcpy(req->handle, ntls->fw_handle, sizeof(ntls->fw_handle));
+	req->tcp_seq = cpu_to_be32(seq);
+	memcpy(req->rec_no, rcd_sn, sizeof(req->rec_no));
+
+	nfp_ccm_mbox_post(nn, skb, NFP_CCM_TYPE_CRYPTO_UPDATE,
+			  sizeof(struct nfp_crypto_reply_simple));
+}
+
 static const struct tlsdev_ops nfp_net_tls_ops = {
 	.tls_dev_add = nfp_net_tls_add,
 	.tls_dev_del = nfp_net_tls_del,
+	.tls_dev_resync_rx = nfp_net_tls_resync_rx,
 };
 
 static int nfp_net_tls_reset(struct nfp_net *nn)
-- 
2.21.0



* [PATCH net-next 09/12] nfp: tls: enable TLS RX offload
From: Jakub Kicinski @ 2019-06-11  4:40 UTC
  To: davem
  Cc: netdev, oss-drivers, alexei.starovoitov, davejwatson, borisp,
	Jakub Kicinski, Dirk van der Merwe

Set ethtool TLS RX feature based on NIC capabilities, and enable
TLS RX when connections are added for decryption.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 .../ethernet/netronome/nfp/crypto/crypto.h    |  5 ++++
 .../net/ethernet/netronome/nfp/crypto/tls.c   | 25 ++++++++++++++-----
 drivers/net/ethernet/netronome/nfp/nfp_net.h  |  2 ++
 3 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/crypto/crypto.h b/drivers/net/ethernet/netronome/nfp/crypto/crypto.h
index 1f97fb443134..591924ad920c 100644
--- a/drivers/net/ethernet/netronome/nfp/crypto/crypto.h
+++ b/drivers/net/ethernet/netronome/nfp/crypto/crypto.h
@@ -7,6 +7,11 @@
 struct nfp_net_tls_offload_ctx {
 	__be32 fw_handle[2];
 
+	u8 rx_end[0];
+	/* Tx only fields follow - Rx side does not have enough driver state
+	 * to fit these
+	 */
+
 	u32 next_seq;
 	bool out_of_sync;
 };
diff --git a/drivers/net/ethernet/netronome/nfp/crypto/tls.c b/drivers/net/ethernet/netronome/nfp/crypto/tls.c
index eebaf5e1621d..4427c1d42047 100644
--- a/drivers/net/ethernet/netronome/nfp/crypto/tls.c
+++ b/drivers/net/ethernet/netronome/nfp/crypto/tls.c
@@ -47,10 +47,16 @@ __nfp_net_tls_conn_cnt_changed(struct nfp_net *nn, int add,
 	u8 opcode;
 	int cnt;
 
-	opcode = NFP_NET_CRYPTO_OP_TLS_1_2_AES_GCM_128_ENC;
-	nn->ktls_tx_conn_cnt += add;
-	cnt = nn->ktls_tx_conn_cnt;
-	nn->dp.ktls_tx = !!nn->ktls_tx_conn_cnt;
+	if (direction == TLS_OFFLOAD_CTX_DIR_TX) {
+		opcode = NFP_NET_CRYPTO_OP_TLS_1_2_AES_GCM_128_ENC;
+		nn->ktls_tx_conn_cnt += add;
+		cnt = nn->ktls_tx_conn_cnt;
+		nn->dp.ktls_tx = !!nn->ktls_tx_conn_cnt;
+	} else {
+		opcode = NFP_NET_CRYPTO_OP_TLS_1_2_AES_GCM_128_DEC;
+		nn->ktls_rx_conn_cnt += add;
+		cnt = nn->ktls_rx_conn_cnt;
+	}
 
 	/* Care only about 0 -> 1 and 1 -> 0 transitions */
 	if (cnt > 1)
@@ -228,7 +234,7 @@ nfp_net_cipher_supported(struct nfp_net *nn, u16 cipher_type,
 		if (direction == TLS_OFFLOAD_CTX_DIR_TX)
 			bit = NFP_NET_CRYPTO_OP_TLS_1_2_AES_GCM_128_ENC;
 		else
-			return false;
+			bit = NFP_NET_CRYPTO_OP_TLS_1_2_AES_GCM_128_DEC;
 		break;
 	default:
 		return false;
@@ -256,6 +262,8 @@ nfp_net_tls_add(struct net_device *netdev, struct sock *sk,
 
 	BUILD_BUG_ON(sizeof(struct nfp_net_tls_offload_ctx) >
 		     TLS_DRIVER_STATE_SIZE_TX);
+	BUILD_BUG_ON(offsetof(struct nfp_net_tls_offload_ctx, rx_end) >
+		     TLS_DRIVER_STATE_SIZE_RX);
 
 	if (!nfp_net_cipher_supported(nn, crypto_info->cipher_type, direction))
 		return -EOPNOTSUPP;
@@ -341,7 +349,8 @@ nfp_net_tls_add(struct net_device *netdev, struct sock *sk,
 
 	ntls = tls_driver_ctx(sk, direction);
 	memcpy(ntls->fw_handle, reply->handle, sizeof(ntls->fw_handle));
-	ntls->next_seq = start_offload_tcp_sn;
+	if (direction == TLS_OFFLOAD_CTX_DIR_TX)
+		ntls->next_seq = start_offload_tcp_sn;
 	dev_consume_skb_any(skb);
 
 	if (direction == TLS_OFFLOAD_CTX_DIR_TX)
@@ -450,6 +459,10 @@ int nfp_net_tls_init(struct nfp_net *nn)
 	if (err)
 		return err;
 
+	if (nn->tlv_caps.crypto_ops & NFP_NET_TLS_OPCODE_MASK_RX) {
+		netdev->hw_features |= NETIF_F_HW_TLS_RX;
+		netdev->features |= NETIF_F_HW_TLS_RX;
+	}
 	if (nn->tlv_caps.crypto_ops & NFP_NET_TLS_OPCODE_MASK_TX) {
 		netdev->hw_features |= NETIF_F_HW_TLS_TX;
 		netdev->features |= NETIF_F_HW_TLS_TX;
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
index 46305f181764..6bbd77ba56f2 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
@@ -582,6 +582,7 @@ struct nfp_net_dp {
  * @rx_bar:             Pointer to mapped FL/RX queues
  * @tlv_caps:		Parsed TLV capabilities
  * @ktls_tx_conn_cnt:	Number of offloaded kTLS TX connections
+ * @ktls_rx_conn_cnt:	Number of offloaded kTLS RX connections
  * @ktls_no_space:	Counter of firmware rejecting kTLS connection due to
  *			lack of space
  * @mbox_cmsg:		Common Control Message via vNIC mailbox state
@@ -667,6 +668,7 @@ struct nfp_net {
 	struct nfp_net_tlv_caps tlv_caps;
 
 	unsigned int ktls_tx_conn_cnt;
+	unsigned int ktls_rx_conn_cnt;
 
 	atomic_t ktls_no_space;
 
-- 
2.21.0



* [PATCH net-next 10/12] net/tls: generalize the resync callback
  2019-06-11  4:39 [PATCH net-next 00/12] tls: add support for kernel-driven resync and nfp RX offload Jakub Kicinski
                   ` (8 preceding siblings ...)
  2019-06-11  4:40 ` [PATCH net-next 09/12] nfp: tls: enable TLS RX offload Jakub Kicinski
@ 2019-06-11  4:40 ` Jakub Kicinski
  2019-06-11  4:40 ` [PATCH net-next 11/12] net/tls: add kernel-driven resync mechanism for TX Jakub Kicinski
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Jakub Kicinski @ 2019-06-11  4:40 UTC (permalink / raw)
  To: davem
  Cc: netdev, oss-drivers, alexei.starovoitov, davejwatson, borisp,
	Jakub Kicinski, Dirk van der Merwe

Currently only the RX direction is ever resynced; however, TX may
also get out of sequence if packets get dropped on the way to
the driver.  Rename the resync callback and add a direction
parameter.
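
As a sketch, a driver which (like mlx5 and the NFP at this point in
the series) only implements RX resync converts to the new hook by
adding a direction check in front of its existing body; the function
name here is illustrative:

  static void foo_tls_dev_resync(struct net_device *netdev,
  				 struct sock *sk, u32 seq, u8 *rcd_sn,
  				 enum tls_offload_ctx_dir direction)
  {
  	/* only RX resync is supported by this device */
  	if (WARN_ON_ONCE(direction != TLS_OFFLOAD_CTX_DIR_RX))
  		return;

  	/* ... former tls_dev_resync_rx() body goes here ... */
  }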

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c | 9 ++++++---
 drivers/net/ethernet/netronome/nfp/crypto/tls.c        | 9 ++++++---
 include/net/tls.h                                      | 5 +++--
 net/tls/tls_device.c                                   | 5 +++--
 4 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c
index d65150aa8298..dc15c5c9e557 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c
@@ -160,14 +160,17 @@ static void mlx5e_tls_del(struct net_device *netdev,
 				direction == TLS_OFFLOAD_CTX_DIR_TX);
 }
 
-static void mlx5e_tls_resync_rx(struct net_device *netdev, struct sock *sk,
-				u32 seq, u8 *rcd_sn_data)
+static void mlx5e_tls_resync(struct net_device *netdev, struct sock *sk,
+			     u32 seq, u8 *rcd_sn_data,
+			     enum tls_offload_ctx_dir direction)
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct mlx5e_priv *priv = netdev_priv(netdev);
 	struct mlx5e_tls_offload_context_rx *rx_ctx;
 	u64 rcd_sn = *(u64 *)rcd_sn_data;
 
+	if (WARN_ON_ONCE(direction != TLS_OFFLOAD_CTX_DIR_RX))
+		return;
 	rx_ctx = mlx5e_get_tls_rx_context(tls_ctx);
 
 	netdev_info(netdev, "resyncing seq %d rcd %lld\n", seq,
@@ -179,7 +182,7 @@ static void mlx5e_tls_resync_rx(struct net_device *netdev, struct sock *sk,
 static const struct tlsdev_ops mlx5e_tls_ops = {
 	.tls_dev_add = mlx5e_tls_add,
 	.tls_dev_del = mlx5e_tls_del,
-	.tls_dev_resync_rx = mlx5e_tls_resync_rx,
+	.tls_dev_resync = mlx5e_tls_resync,
 };
 
 void mlx5e_tls_build_netdev(struct mlx5e_priv *priv)
diff --git a/drivers/net/ethernet/netronome/nfp/crypto/tls.c b/drivers/net/ethernet/netronome/nfp/crypto/tls.c
index 4427c1d42047..93f87b7633b1 100644
--- a/drivers/net/ethernet/netronome/nfp/crypto/tls.c
+++ b/drivers/net/ethernet/netronome/nfp/crypto/tls.c
@@ -383,14 +383,17 @@ nfp_net_tls_del(struct net_device *netdev, struct tls_context *tls_ctx,
 }
 
 static void
-nfp_net_tls_resync_rx(struct net_device *netdev, struct sock *sk, u32 seq,
-		      u8 *rcd_sn)
+nfp_net_tls_resync(struct net_device *netdev, struct sock *sk, u32 seq,
+		   u8 *rcd_sn, enum tls_offload_ctx_dir direction)
 {
 	struct nfp_net *nn = netdev_priv(netdev);
 	struct nfp_net_tls_offload_ctx *ntls;
 	struct nfp_crypto_req_update *req;
 	struct sk_buff *skb;
 
+	if (WARN_ON_ONCE(direction != TLS_OFFLOAD_CTX_DIR_RX))
+		return;
+
 	skb = nfp_net_tls_alloc_simple(nn, sizeof(*req), GFP_ATOMIC);
 	if (!skb)
 		return;
@@ -411,7 +414,7 @@ nfp_net_tls_resync_rx(struct net_device *netdev, struct sock *sk, u32 seq,
 static const struct tlsdev_ops nfp_net_tls_ops = {
 	.tls_dev_add = nfp_net_tls_add,
 	.tls_dev_del = nfp_net_tls_del,
-	.tls_dev_resync_rx = nfp_net_tls_resync_rx,
+	.tls_dev_resync = nfp_net_tls_resync,
 };
 
 static int nfp_net_tls_reset(struct nfp_net *nn)
diff --git a/include/net/tls.h b/include/net/tls.h
index 28eca6a3b615..9b49baecc4a8 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -299,8 +299,9 @@ struct tlsdev_ops {
 	void (*tls_dev_del)(struct net_device *netdev,
 			    struct tls_context *ctx,
 			    enum tls_offload_ctx_dir direction);
-	void (*tls_dev_resync_rx)(struct net_device *netdev,
-				  struct sock *sk, u32 seq, u8 *rcd_sn);
+	void (*tls_dev_resync)(struct net_device *netdev,
+			       struct sock *sk, u32 seq, u8 *rcd_sn,
+			       enum tls_offload_ctx_dir direction);
 };
 
 enum tls_offload_sync_type {
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 477c869c69c8..b35a3b902bfa 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -559,7 +559,8 @@ static void tls_device_resync_rx(struct tls_context *tls_ctx,
 		return;
 	netdev = READ_ONCE(tls_ctx->netdev);
 	if (netdev)
-		netdev->tlsdev_ops->tls_dev_resync_rx(netdev, sk, seq, rcd_sn);
+		netdev->tlsdev_ops->tls_dev_resync(netdev, sk, seq, rcd_sn,
+						   TLS_OFFLOAD_CTX_DIR_RX);
 	clear_bit_unlock(TLS_RX_SYNC_RUNNING, &tls_ctx->flags);
 }
 
@@ -1105,7 +1106,7 @@ static int tls_dev_event(struct notifier_block *this, unsigned long event,
 	case NETDEV_REGISTER:
 	case NETDEV_FEAT_CHANGE:
 		if ((dev->features & NETIF_F_HW_TLS_RX) &&
-		    !dev->tlsdev_ops->tls_dev_resync_rx)
+		    !dev->tlsdev_ops->tls_dev_resync)
 			return NOTIFY_BAD;
 
 		if  (dev->tlsdev_ops &&
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next 11/12] net/tls: add kernel-driven resync mechanism for TX
  2019-06-11  4:39 [PATCH net-next 00/12] tls: add support for kernel-driven resync and nfp RX offload Jakub Kicinski
                   ` (9 preceding siblings ...)
  2019-06-11  4:40 ` [PATCH net-next 10/12] net/tls: generalize the resync callback Jakub Kicinski
@ 2019-06-11  4:40 ` Jakub Kicinski
  2019-06-11  4:40 ` [PATCH net-next 12/12] nfp: tls: make use of kernel-driven TX resync Jakub Kicinski
  2019-06-11 19:22 ` [PATCH net-next 00/12] tls: add support for kernel-driven resync and nfp RX offload David Miller
  12 siblings, 0 replies; 14+ messages in thread
From: Jakub Kicinski @ 2019-06-11  4:40 UTC (permalink / raw)
  To: davem
  Cc: netdev, oss-drivers, alexei.starovoitov, davejwatson, borisp,
	Jakub Kicinski, Dirk van der Merwe

TLS offload drivers keep track of TCP seq numbers to make sure
the packets are fed into the HW in order.

When packets get dropped on the way through the stack, the driver
will get out of sync and have to use fallback encryption, but unless
the TCP seq number is resynced it will never match the packets
correctly (or even worse - use an incorrect record sequence number
after the TCP seq wraps).

Existing drivers (mlx5) feed the entire record on every out-of-order
event, allowing FW/HW to always be in sync.

This patch adds an alternative, more akin to the RX resync.  When
the driver sees a frame which is past its expected sequence number,
the stream must have gotten out of order (if the sequence number is
smaller than expected it's likely a retransmission, which doesn't
require a resync).  The driver will ask the stack to perform a TX
sync before it submits the next full record, and fall back to
software crypto until the stack has performed the sync.
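
Schematically, a driver's xmit path combines the two new helpers
like this (a sketch following the shape of the nfp patch later in
the series; ctx->next_seq stands in for whatever per-connection
sequence tracking the driver keeps):

  bool resync_pending = tls_offload_tx_resync_pending(skb->sk);
  u32 seq = ntohl(tcp_hdr(skb)->seq);

  if (unlikely(resync_pending || ctx->next_seq != seq)) {
  	/* ahead of the expected seq - likely a local drop */
  	if (!resync_pending && seq - ctx->next_seq < U32_MAX / 4)
  		tls_offload_tx_resync_request(skb->sk);

  	/* encrypt in software until the stack has resynced the device */
  	return tls_encrypt_skb(skb);
  }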

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 Documentation/networking/tls-offload.rst | 35 +++++++++++++++++++++++-
 include/net/tls.h                        | 23 ++++++++++++++++
 net/tls/tls_device.c                     | 27 ++++++++++++++++++
 3 files changed, 84 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/tls-offload.rst b/Documentation/networking/tls-offload.rst
index d134d63307e7..048e5ca44824 100644
--- a/Documentation/networking/tls-offload.rst
+++ b/Documentation/networking/tls-offload.rst
@@ -206,7 +206,11 @@ TX
 
 Segments transmitted from an offloaded socket can get out of sync
 in similar ways to the receive side-retransmissions - local drops
-are possible, though network reorders are not.
+are possible, though network reorders are not. There are currently
+two mechanisms for dealing with out of order segments.
+
+Crypto state rebuilding
+~~~~~~~~~~~~~~~~~~~~~~~
 
 Whenever an out of order segment is transmitted the driver provides
 the device with enough information to perform cryptographic operations.
@@ -225,6 +229,35 @@ was just a retransmission. The former is simpler, and does not require
 retransmission detection therefore it is the recommended method until
 such time it is proven inefficient.
 
+Next record sync
+~~~~~~~~~~~~~~~~
+
+Whenever an out of order segment is detected the driver requests
+that the ``ktls`` software fallback code encrypt it. If the segment's
+sequence number is lower than expected the driver assumes retransmission
+and doesn't change the device state. If the segment is in the future, it
+may imply a local drop; the driver asks the stack to sync the device
+to the next record state and falls back to software.
+
+A resync request is indicated with:
+
+.. code-block:: c
+
+  void tls_offload_tx_resync_request(struct sock *sk)
+
+Until the resync is complete the driver should not access its expected
+TCP sequence number (as it will be updated from a different context).
+The following helper should be used to test whether the resync is complete:
+
+.. code-block:: c
+
+  bool tls_offload_tx_resync_pending(struct sock *sk)
+
+The next time ``ktls`` pushes a record it will first send its TCP sequence
+number and TLS record number to the driver. The stack will also make sure
+that the new record starts on a segment boundary (like it does when
+the connection is initially added).
+
 RX
 --
 
diff --git a/include/net/tls.h b/include/net/tls.h
index 9b49baecc4a8..63e473420b00 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -212,6 +212,11 @@ struct tls_offload_context_tx {
 
 enum tls_context_flags {
 	TLS_RX_SYNC_RUNNING = 0,
+	/* Unlike RX, where resync is driven entirely by the core, in TX only
+	 * the driver knows when things went out of sync, so we need the flag
+	 * to be atomic.
+	 */
+	TLS_TX_SYNC_SCHED = 1,
 };
 
 struct cipher_context {
@@ -619,6 +624,24 @@ tls_offload_rx_resync_set_type(struct sock *sk, enum tls_offload_sync_type type)
 	tls_offload_ctx_rx(tls_ctx)->resync_type = type;
 }
 
+static inline void tls_offload_tx_resync_request(struct sock *sk)
+{
+	struct tls_context *tls_ctx = tls_get_ctx(sk);
+
+	WARN_ON(test_and_set_bit(TLS_TX_SYNC_SCHED, &tls_ctx->flags));
+}
+
+/* Driver's seq tracking has to be disabled until resync has succeeded */
+static inline bool tls_offload_tx_resync_pending(struct sock *sk)
+{
+	struct tls_context *tls_ctx = tls_get_ctx(sk);
+	bool ret;
+
+	ret = test_bit(TLS_TX_SYNC_SCHED, &tls_ctx->flags);
+	smp_mb__after_atomic();
+	return ret;
+}
+
 int tls_proccess_cmsg(struct sock *sk, struct msghdr *msg,
 		      unsigned char *record_type);
 void tls_register_device(struct tls_device *device);
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index b35a3b902bfa..40076f423dcb 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -209,6 +209,29 @@ void tls_device_free_resources_tx(struct sock *sk)
 	tls_free_partial_record(sk, tls_ctx);
 }
 
+static void tls_device_resync_tx(struct sock *sk, struct tls_context *tls_ctx,
+				 u32 seq)
+{
+	struct net_device *netdev;
+	struct sk_buff *skb;
+	u8 *rcd_sn;
+
+	skb = tcp_write_queue_tail(sk);
+	if (skb)
+		TCP_SKB_CB(skb)->eor = 1;
+
+	rcd_sn = tls_ctx->tx.rec_seq;
+
+	down_read(&device_offload_lock);
+	netdev = tls_ctx->netdev;
+	if (netdev)
+		netdev->tlsdev_ops->tls_dev_resync(netdev, sk, seq, rcd_sn,
+						   TLS_OFFLOAD_CTX_DIR_TX);
+	up_read(&device_offload_lock);
+
+	clear_bit_unlock(TLS_TX_SYNC_SCHED, &tls_ctx->flags);
+}
+
 static void tls_append_frag(struct tls_record_info *record,
 			    struct page_frag *pfrag,
 			    int size)
@@ -264,6 +287,10 @@ static int tls_push_record(struct sock *sk,
 	list_add_tail(&record->list, &offload_ctx->records_list);
 	spin_unlock_irq(&offload_ctx->lock);
 	offload_ctx->open_record = NULL;
+
+	if (test_bit(TLS_TX_SYNC_SCHED, &ctx->flags))
+		tls_device_resync_tx(sk, ctx, tp->write_seq);
+
 	tls_advance_record_sn(sk, prot, &ctx->tx);
 
 	for (i = 0; i < record->num_frags; i++) {
-- 
2.21.0
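
Piecing the three contexts of this patch together, the flag acts as
a simple hand-off protocol (a schematic of the flow above, not
actual core code; use_sw_fallback() is a hypothetical helper):

  /* 1. driver xmit path (atomic) - noticed seq jumped ahead: */
  tls_offload_tx_resync_request(sk);	/* test_and_set_bit() */

  /* 2. ktls core, when the next record is pushed: */
  if (test_bit(TLS_TX_SYNC_SCHED, &ctx->flags))
  	tls_device_resync_tx(sk, ctx, tp->write_seq);
  	/* -> calls ->tls_dev_resync(..., TLS_OFFLOAD_CTX_DIR_TX),
  	 *    then clear_bit_unlock() publishes the new state */

  /* 3. driver xmit path, later: */
  if (tls_offload_tx_resync_pending(sk))
  	return use_sw_fallback(skb);
  /* otherwise the driver's seq tracking is valid again */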


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next 12/12] nfp: tls: make use of kernel-driven TX resync
  2019-06-11  4:39 [PATCH net-next 00/12] tls: add support for kernel-driven resync and nfp RX offload Jakub Kicinski
                   ` (10 preceding siblings ...)
  2019-06-11  4:40 ` [PATCH net-next 11/12] net/tls: add kernel-driven resync mechanism for TX Jakub Kicinski
@ 2019-06-11  4:40 ` Jakub Kicinski
  2019-06-11 19:22 ` [PATCH net-next 00/12] tls: add support for kernel-driven resync and nfp RX offload David Miller
  12 siblings, 0 replies; 14+ messages in thread
From: Jakub Kicinski @ 2019-06-11  4:40 UTC (permalink / raw)
  To: davem
  Cc: netdev, oss-drivers, alexei.starovoitov, davejwatson, borisp,
	Jakub Kicinski, Dirk van der Merwe

When the TCP stream gets out of sync (the driver stops receiving
skbs with the expected TCP sequence numbers) request a TX resync
from the kernel.

We try to distinguish retransmissions from missed transmissions
by comparing the sequence number to the expected one - if it is
further ahead than expected, we probably missed packets.
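
Note that the comparison relies on unsigned wraparound arithmetic:
seq - next_seq is computed modulo 2^32, so a jump forward yields a
small value while a retransmission yields a huge one. Illustrative
values (not from the patch):

  u32 next_seq = 0xffffff00;	/* what the driver expects next */
  u32 seq = 0x00000010;		/* what arrived (wrapped past next_seq) */

  /* seq - next_seq == 0x110 < U32_MAX / 4: jumped ahead, resync */

  /* for a retransmission, e.g. seq == 0xfffffe00:
   * seq - next_seq == 0xffffff00 >= U32_MAX / 4: old data, no resync
   */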

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 .../ethernet/netronome/nfp/crypto/crypto.h    |  1 -
 .../net/ethernet/netronome/nfp/crypto/tls.c   | 21 ++++++++++++-------
 .../ethernet/netronome/nfp/nfp_net_common.c   |  8 ++++---
 3 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/crypto/crypto.h b/drivers/net/ethernet/netronome/nfp/crypto/crypto.h
index 591924ad920c..60372ddf69f0 100644
--- a/drivers/net/ethernet/netronome/nfp/crypto/crypto.h
+++ b/drivers/net/ethernet/netronome/nfp/crypto/crypto.h
@@ -13,7 +13,6 @@ struct nfp_net_tls_offload_ctx {
 	 */
 
 	u32 next_seq;
-	bool out_of_sync;
 };
 
 #ifdef CONFIG_TLS_DEVICE
diff --git a/drivers/net/ethernet/netronome/nfp/crypto/tls.c b/drivers/net/ethernet/netronome/nfp/crypto/tls.c
index 93f87b7633b1..3ee829d69c04 100644
--- a/drivers/net/ethernet/netronome/nfp/crypto/tls.c
+++ b/drivers/net/ethernet/netronome/nfp/crypto/tls.c
@@ -390,25 +390,30 @@ nfp_net_tls_resync(struct net_device *netdev, struct sock *sk, u32 seq,
 	struct nfp_net_tls_offload_ctx *ntls;
 	struct nfp_crypto_req_update *req;
 	struct sk_buff *skb;
+	gfp_t flags;
 
-	if (WARN_ON_ONCE(direction != TLS_OFFLOAD_CTX_DIR_RX))
-		return;
-
-	skb = nfp_net_tls_alloc_simple(nn, sizeof(*req), GFP_ATOMIC);
+	flags = direction == TLS_OFFLOAD_CTX_DIR_TX ? GFP_KERNEL : GFP_ATOMIC;
+	skb = nfp_net_tls_alloc_simple(nn, sizeof(*req), flags);
 	if (!skb)
 		return;
 
-	ntls = tls_driver_ctx(sk, TLS_OFFLOAD_CTX_DIR_RX);
+	ntls = tls_driver_ctx(sk, direction);
 	req = (void *)skb->data;
 	req->ep_id = 0;
-	req->opcode = NFP_NET_CRYPTO_OP_TLS_1_2_AES_GCM_128_DEC;
+	req->opcode = nfp_tls_1_2_dir_to_opcode(direction);
 	memset(req->resv, 0, sizeof(req->resv));
 	memcpy(req->handle, ntls->fw_handle, sizeof(ntls->fw_handle));
 	req->tcp_seq = cpu_to_be32(seq);
 	memcpy(req->rec_no, rcd_sn, sizeof(req->rec_no));
 
-	nfp_ccm_mbox_post(nn, skb, NFP_CCM_TYPE_CRYPTO_UPDATE,
-			  sizeof(struct nfp_crypto_reply_simple));
+	if (direction == TLS_OFFLOAD_CTX_DIR_TX) {
+		nfp_net_tls_communicate_simple(nn, skb, "sync",
+					       NFP_CCM_TYPE_CRYPTO_UPDATE);
+		ntls->next_seq = seq;
+	} else {
+		nfp_ccm_mbox_post(nn, skb, NFP_CCM_TYPE_CRYPTO_UPDATE,
+				  sizeof(struct nfp_crypto_reply_simple));
+	}
 }
 
 static const struct tlsdev_ops nfp_net_tls_ops = {
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index c9c43abb2427..8e9568b15062 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -829,6 +829,7 @@ nfp_net_tls_tx(struct nfp_net_dp *dp, struct nfp_net_r_vector *r_vec,
 {
 	struct nfp_net_tls_offload_ctx *ntls;
 	struct sk_buff *nskb;
+	bool resync_pending;
 	u32 datalen, seq;
 
 	if (likely(!dp->ktls_tx))
@@ -839,7 +840,8 @@ nfp_net_tls_tx(struct nfp_net_dp *dp, struct nfp_net_r_vector *r_vec,
 	datalen = skb->len - (skb_transport_offset(skb) + tcp_hdrlen(skb));
 	seq = ntohl(tcp_hdr(skb)->seq);
 	ntls = tls_driver_ctx(skb->sk, TLS_OFFLOAD_CTX_DIR_TX);
-	if (unlikely(ntls->next_seq != seq || ntls->out_of_sync)) {
+	resync_pending = tls_offload_tx_resync_pending(skb->sk);
+	if (unlikely(resync_pending || ntls->next_seq != seq)) {
 		/* Pure ACK out of order already */
 		if (!datalen)
 			return skb;
@@ -869,8 +871,8 @@ nfp_net_tls_tx(struct nfp_net_dp *dp, struct nfp_net_r_vector *r_vec,
 		}
 
 		/* jump forward, a TX may have gotten lost, need to sync TX */
-		if (!ntls->out_of_sync && seq - ntls->next_seq < U32_MAX / 4)
-			ntls->out_of_sync = true;
+		if (!resync_pending && seq - ntls->next_seq < U32_MAX / 4)
+			tls_offload_tx_resync_request(nskb->sk);
 
 		*nr_frags = 0;
 		return nskb;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 00/12] tls: add support for kernel-driven resync and nfp RX offload
  2019-06-11  4:39 [PATCH net-next 00/12] tls: add support for kernel-driven resync and nfp RX offload Jakub Kicinski
                   ` (11 preceding siblings ...)
  2019-06-11  4:40 ` [PATCH net-next 12/12] nfp: tls: make use of kernel-driven TX resync Jakub Kicinski
@ 2019-06-11 19:22 ` David Miller
  12 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2019-06-11 19:22 UTC (permalink / raw)
  To: jakub.kicinski
  Cc: netdev, oss-drivers, alexei.starovoitov, davejwatson, borisp

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Mon, 10 Jun 2019 21:39:58 -0700

> This series adds TLS RX offload for NFP and completes the offload
> by providing resync strategies.  When TLS data stream looses segments
> or experiences reorder NIC can no longer perform in line offload.
> Resyncs provide information about placement of records in the
> stream so that offload can resume.
> 
> Existing TLS resync mechanisms are not a great fit for the NFP.
> In particular the TX resync is hard to implement for packet-centric
> NICs.  This patchset adds an ability to perform TX resync in a way
> similar to the way initial sync is done - by calling down to the
> driver when new record is created after driver indicated sync had
> been lost.
> 
> Similarly on the RX side, we try to wait for a gap in the stream
> and send record information for the next record.  This works very
> well for RPC workloads which are the primary focus at this time.

Series applied, thanks Jakub.

^ permalink raw reply	[flat|nested] 14+ messages in thread
