netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [net 0/7] cxgb4/ch_ktls: Fixes in nic tls code
@ 2020-10-22 10:10 Rohit Maheshwari
  2020-10-22 10:10 ` [net 1/7] cxgb4/ch_ktls: decrypted bit is not enough Rohit Maheshwari
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Rohit Maheshwari @ 2020-10-22 10:10 UTC (permalink / raw)
  To: kuba, netdev, davem; +Cc: secdev, Rohit Maheshwari

This series helps in fixing multiple nic ktls issues.
Series is broken into 7 patches.

Patch 1 avoids deciding tls packet based on decrypted bit. If its
a retransmit packet which has tls handshake and finish (for
encryption), decrypted bit won't be set there, and so we can't
rely on decrypted bit.

Patch 2 helps supporting linear skb. SKBs were assumed non-linear.
Corrected the length extraction.

Patch 3 fixes kernel panic happening due to creating new skb for
each record. As part of fix driver will use same skb to send out
one tls record (partial data) of the same SKB.

Patch 4 avoids sending extra data which will be used to make a
record 16 byte aligned. We don't need to retransmit those extra
few bytes.

Patch 5 handles the cases where retransmit packet has tls starting
exchanges which are prior to tls start marker.

Patch 6 handles the small packet case which has partial TAG bytes
only. HW can't handle those, hence using sw crypto for such pkts.

Patch 7 corrects the potential tcb update problem.

Rohit Maheshwari (7):
  cxgb4/ch_ktls: decrypted bit is not enough
  ch_ktls: Correction in finding correct length
  cxgb4/ch_ktls: creating skbs causes panic
  ch_ktls: Correction in middle record handling
  ch_ktls: packet handling prior to start marker
  ch_ktls/cxgb4: handle partial tag alone SKBs
  ch_ktls: tcb update fails sometimes

 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h    |   3 +
 .../ethernet/chelsio/cxgb4/cxgb4_debugfs.c    |   2 +
 .../net/ethernet/chelsio/cxgb4/cxgb4_main.c   |   3 +
 .../net/ethernet/chelsio/cxgb4/cxgb4_uld.h    |   8 +
 drivers/net/ethernet/chelsio/cxgb4/sge.c      | 111 ++-
 .../chelsio/inline_crypto/ch_ktls/chcr_ktls.c | 784 +++++++++---------
 .../chelsio/inline_crypto/ch_ktls/chcr_ktls.h |   1 +
 7 files changed, 542 insertions(+), 370 deletions(-)

-- 
2.18.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [net 1/7] cxgb4/ch_ktls: decrypted bit is not enough
  2020-10-22 10:10 [net 0/7] cxgb4/ch_ktls: Fixes in nic tls code Rohit Maheshwari
@ 2020-10-22 10:10 ` Rohit Maheshwari
  2020-10-22 10:10 ` [net 2/7] ch_ktls: Correction in finding correct length Rohit Maheshwari
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Rohit Maheshwari @ 2020-10-22 10:10 UTC (permalink / raw)
  To: kuba, netdev, davem; +Cc: secdev, Rohit Maheshwari

If skb has retransmit data starting before start marker, e.g. ccs,
decrypted bit won't be set for that, and if it has some data to
encrypt, then it must be given to crypto ULD. So in place of
decrypted, check if socket is tls offloaded. Also, unless skb has
some data to encrypt, no need to give it for tls offload handling.

Fixes: 5a4b9fe7fece ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c            | 3 +++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h             | 7 +++++++
 drivers/net/ethernet/chelsio/cxgb4/sge.c                   | 3 ++-
 .../net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c | 4 ----
 4 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index a952fe198eb9..c555a9f962f8 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -1176,6 +1176,9 @@ static u16 cxgb_select_queue(struct net_device *dev, struct sk_buff *skb,
 		txq = netdev_pick_tx(dev, skb, sb_dev);
 		if (xfrm_offload(skb) || is_ptp_enabled(skb, dev) ||
 		    skb->encapsulation ||
+#if IS_ENABLED(CONFIG_CHELSIO_TLS_DEVICE)
+		    cxgb4_is_ktls_skb(skb) ||
+#endif
 		    (proto != IPPROTO_TCP && proto != IPPROTO_UDP))
 			txq = txq % pi->nqsets;
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
index b169776ab484..65a8d4d4c6e5 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
@@ -493,6 +493,13 @@ struct cxgb4_uld_info {
 #endif
 };
 
+#if IS_ENABLED(CONFIG_CHELSIO_TLS_DEVICE)
+static inline bool cxgb4_is_ktls_skb(struct sk_buff *skb)
+{
+	return skb->sk && tls_is_sk_tx_device_offloaded(skb->sk);
+}
+#endif
+
 void cxgb4_uld_enable(struct adapter *adap);
 void cxgb4_register_uld(enum cxgb4_uld type, const struct cxgb4_uld_info *p);
 int cxgb4_unregister_uld(enum cxgb4_uld type);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c
index a9e9c7ae565d..01bd9c0dfe4e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c
@@ -1422,7 +1422,8 @@ static netdev_tx_t cxgb4_eth_xmit(struct sk_buff *skb, struct net_device *dev)
 #endif /* CHELSIO_IPSEC_INLINE */
 
 #if IS_ENABLED(CONFIG_CHELSIO_TLS_DEVICE)
-	if (skb->decrypted)
+	if (cxgb4_is_ktls_skb(skb) &&
+	    (skb->len - (skb_transport_offset(skb) + tcp_hdrlen(skb))))
 		return adap->uld[CXGB4_ULD_KTLS].tx_handler(skb, dev);
 #endif /* CHELSIO_TLS_DEVICE */
 
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
index 5195f692f14d..43c723c72c61 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
@@ -1878,10 +1878,6 @@ static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	mss = skb_is_gso(skb) ? skb_shinfo(skb)->gso_size : skb->data_len;
 
-	/* check if we haven't set it for ktls offload */
-	if (!skb->sk || !tls_is_sk_tx_device_offloaded(skb->sk))
-		goto out;
-
 	tls_ctx = tls_get_ctx(skb->sk);
 	if (unlikely(tls_ctx->netdev != dev))
 		goto out;
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [net 2/7] ch_ktls: Correction in finding correct length
  2020-10-22 10:10 [net 0/7] cxgb4/ch_ktls: Fixes in nic tls code Rohit Maheshwari
  2020-10-22 10:10 ` [net 1/7] cxgb4/ch_ktls: decrypted bit is not enough Rohit Maheshwari
@ 2020-10-22 10:10 ` Rohit Maheshwari
  2020-10-22 10:10 ` [net 3/7] cxgb4/ch_ktls: creating skbs causes panic Rohit Maheshwari
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Rohit Maheshwari @ 2020-10-22 10:10 UTC (permalink / raw)
  To: kuba, netdev, davem; +Cc: secdev, Rohit Maheshwari

There is a possibility of linear skbs coming in. Correcting
the length extraction logic.

Fixes: 5a4b9fe7fece ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
---
 .../chelsio/inline_crypto/ch_ktls/chcr_ktls.c | 62 +++++++++++--------
 1 file changed, 37 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
index 43c723c72c61..c69c3901a707 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
@@ -943,22 +943,21 @@ chcr_ktls_check_tcp_options(struct tcphdr *tcp)
  * @tx_chan - channel number.
  * return: NETDEV_TX_OK/NETDEV_TX_BUSY.
  */
-static int
-chcr_ktls_write_tcp_options(struct chcr_ktls_info *tx_info, struct sk_buff *skb,
-			    struct sge_eth_txq *q, uint32_t tx_chan)
+static int chcr_ktls_write_tcp_options(struct chcr_ktls_info *tx_info,
+				       struct sk_buff *skb,
+				       struct sge_eth_txq *q, bool fin)
 {
+	u32 ctrl, iplen, maclen, wr_mid = 0, pktlen, ndesc, len16;
 	struct fw_eth_tx_pkt_wr *wr;
 	struct cpl_tx_pkt_core *cpl;
-	u32 ctrl, iplen, maclen;
 #if IS_ENABLED(CONFIG_IPV6)
 	struct ipv6hdr *ip6;
 #endif
-	unsigned int ndesc;
 	struct tcphdr *tcp;
-	int len16, pktlen;
+	u8 buf[150] = {0};
 	struct iphdr *ip;
 	int credits;
-	u8 buf[150];
+	u64 cntrl1;
 	void *pos;
 
 	iplen = skb_network_header_len(skb);
@@ -967,7 +966,7 @@ chcr_ktls_write_tcp_options(struct chcr_ktls_info *tx_info, struct sk_buff *skb,
 	/* packet length = eth hdr len + ip hdr len + tcp hdr len
 	 * (including options).
 	 */
-	pktlen = skb->len - skb->data_len;
+	pktlen = skb_transport_offset(skb) + tcp_hdrlen(skb);
 
 	ctrl = sizeof(*cpl) + pktlen;
 	len16 = DIV_ROUND_UP(sizeof(*wr) + ctrl, 16);
@@ -980,6 +979,10 @@ chcr_ktls_write_tcp_options(struct chcr_ktls_info *tx_info, struct sk_buff *skb,
 		return NETDEV_TX_BUSY;
 	}
 
+	if (unlikely(credits < ETHTXQ_STOP_THRES)) {
+		chcr_eth_txq_stop(q);
+		wr_mid |= FW_WR_EQUEQ_F | FW_WR_EQUIQ_F;
+	}
 	pos = &q->q.desc[q->q.pidx];
 	wr = pos;
 
@@ -987,42 +990,51 @@ chcr_ktls_write_tcp_options(struct chcr_ktls_info *tx_info, struct sk_buff *skb,
 	wr->op_immdlen = htonl(FW_WR_OP_V(FW_ETH_TX_PKT_WR) |
 			       FW_WR_IMMDLEN_V(ctrl));
 
-	wr->equiq_to_len16 = htonl(FW_WR_LEN16_V(len16));
+	wr->equiq_to_len16 = htonl(wr_mid | FW_WR_LEN16_V(len16));
 	wr->r3 = 0;
 
 	cpl = (void *)(wr + 1);
 
 	/* CPL header */
-	cpl->ctrl0 = htonl(TXPKT_OPCODE_V(CPL_TX_PKT) | TXPKT_INTF_V(tx_chan) |
+	cpl->ctrl0 = htonl(TXPKT_OPCODE_V(CPL_TX_PKT) |
+			   TXPKT_INTF_V(tx_info->tx_chan) |
 			   TXPKT_PF_V(tx_info->adap->pf));
 	cpl->pack = 0;
 	cpl->len = htons(pktlen);
-	/* checksum offload */
-	cpl->ctrl1 = 0;
-
-	pos = cpl + 1;
 
 	memcpy(buf, skb->data, pktlen);
 	if (tx_info->ip_family == AF_INET) {
 		/* we need to correct ip header len */
 		ip = (struct iphdr *)(buf + maclen);
 		ip->tot_len = htons(pktlen - maclen);
+		cntrl1 = TXPKT_CSUM_TYPE_V(TX_CSUM_TCPIP);
 #if IS_ENABLED(CONFIG_IPV6)
 	} else {
 		ip6 = (struct ipv6hdr *)(buf + maclen);
 		ip6->payload_len = htons(pktlen - maclen - iplen);
+		cntrl1 = TXPKT_CSUM_TYPE_V(TX_CSUM_TCPIP6);
 #endif
 	}
+
+	cntrl1 |= T6_TXPKT_ETHHDR_LEN_V(maclen - ETH_HLEN) |
+		  TXPKT_IPHDR_LEN_V(iplen);
+	/* checksum offload */
+	cpl->ctrl1 = cpu_to_be64(cntrl1);
+
+	pos = cpl + 1;
+
 	/* now take care of the tcp header, if fin is not set then clear push
 	 * bit as well, and if fin is set, it will be sent at the last so we
 	 * need to update the tcp sequence number as per the last packet.
 	 */
 	tcp = (struct tcphdr *)(buf + maclen + iplen);
 
-	if (!tcp->fin)
+	if (!fin) {
 		tcp->psh = 0;
-	else
+		tcp->fin = 0;
+	} else {
 		tcp->seq = htonl(tx_info->prev_seq);
+	}
 
 	chcr_copy_to_txd(buf, &q->q, pos, pktlen);
 
@@ -1860,6 +1872,7 @@ static int chcr_short_record_handler(struct chcr_ktls_info *tx_info,
 /* nic tls TX handler */
 static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev)
 {
+	u32 tls_end_offset, tcp_seq, skb_data_len, skb_offset;
 	struct ch_ktls_port_stats_debug *port_stats;
 	struct chcr_ktls_ofld_ctx_tx *tx_ctx;
 	struct ch_ktls_stats_debug *stats;
@@ -1867,7 +1880,6 @@ static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev)
 	int data_len, qidx, ret = 0, mss;
 	struct tls_record_info *record;
 	struct chcr_ktls_info *tx_info;
-	u32 tls_end_offset, tcp_seq;
 	struct tls_context *tls_ctx;
 	struct sk_buff *local_skb;
 	struct sge_eth_txq *q;
@@ -1875,8 +1887,11 @@ static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev)
 	unsigned long flags;
 
 	tcp_seq = ntohl(th->seq);
+	skb_offset = skb_transport_offset(skb) + tcp_hdrlen(skb);
+	skb_data_len = skb->len - skb_offset;
+	data_len = skb_data_len;
 
-	mss = skb_is_gso(skb) ? skb_shinfo(skb)->gso_size : skb->data_len;
+	mss = skb_is_gso(skb) ? skb_shinfo(skb)->gso_size : data_len;
 
 	tls_ctx = tls_get_ctx(skb->sk);
 	if (unlikely(tls_ctx->netdev != dev))
@@ -1905,8 +1920,7 @@ static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev)
 	cxgb4_reclaim_completed_tx(adap, &q->q, true);
 	/* if tcp options are set but finish is not send the options first */
 	if (!th->fin && chcr_ktls_check_tcp_options(th)) {
-		ret = chcr_ktls_write_tcp_options(tx_info, skb, q,
-						  tx_info->tx_chan);
+		ret = chcr_ktls_write_tcp_options(tx_info, skb, q, false);
 		if (ret)
 			return NETDEV_TX_BUSY;
 	}
@@ -1922,8 +1936,6 @@ static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev)
 	/* copy skb contents into local skb */
 	chcr_ktls_skb_copy(skb, local_skb);
 
-	/* go through the skb and send only one record at a time. */
-	data_len = skb->data_len;
 	/* TCP segments can be in received either complete or partial.
 	 * chcr_end_part_handler will handle cases if complete record or end
 	 * part of the record is received. Incase of partial end part of record,
@@ -2020,15 +2032,15 @@ static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	} while (data_len > 0);
 
-	tx_info->prev_seq = ntohl(th->seq) + skb->data_len;
+	tx_info->prev_seq = ntohl(th->seq) + skb_data_len;
 	atomic64_inc(&port_stats->ktls_tx_encrypted_packets);
-	atomic64_add(skb->data_len, &port_stats->ktls_tx_encrypted_bytes);
+	atomic64_add(skb_data_len, &port_stats->ktls_tx_encrypted_bytes);
 
 	/* tcp finish is set, send a separate tcp msg including all the options
 	 * as well.
 	 */
 	if (th->fin)
-		chcr_ktls_write_tcp_options(tx_info, skb, q, tx_info->tx_chan);
+		chcr_ktls_write_tcp_options(tx_info, skb, q, true);
 
 out:
 	dev_kfree_skb_any(skb);
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [net 3/7] cxgb4/ch_ktls: creating skbs causes panic
  2020-10-22 10:10 [net 0/7] cxgb4/ch_ktls: Fixes in nic tls code Rohit Maheshwari
  2020-10-22 10:10 ` [net 1/7] cxgb4/ch_ktls: decrypted bit is not enough Rohit Maheshwari
  2020-10-22 10:10 ` [net 2/7] ch_ktls: Correction in finding correct length Rohit Maheshwari
@ 2020-10-22 10:10 ` Rohit Maheshwari
  2020-10-22 23:53   ` Jakub Kicinski
  2020-10-22 10:10 ` [net 4/7] ch_ktls: Correction in middle record handling Rohit Maheshwari
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 10+ messages in thread
From: Rohit Maheshwari @ 2020-10-22 10:10 UTC (permalink / raw)
  To: kuba, netdev, davem; +Cc: secdev, Rohit Maheshwari

Creating SKB per tls record and freeing the original one causes
panic. There will be race if connection reset is requested. By
freeing original skb, refcnt will be decremented and that means,
there is no pending record to send, and so tls_dev_del will be
requested in control path while SKB of related connection is in
queue.
 Better approach is to use same SKB to send one record (partial
data) at a time. We still have to create a new SKB when partial
last part of a record is requested.
 This fix introduces new API cxgb4_write_partial_sgl() to send
partial part of skb. Present cxgb4_write_sgl can only provide
feasibility to start from an offset which limits to header only
and it can write sgls for the whole skb len. But this new API
will help in both. It can start from any offset and can end
writing in middle of the skb.

Fixes: 429765a149f1 ("chcr: handle partial end part of a record"
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h    |   3 +
 drivers/net/ethernet/chelsio/cxgb4/sge.c      | 108 ++++
 .../chelsio/inline_crypto/ch_ktls/chcr_ktls.c | 543 +++++++-----------
 3 files changed, 327 insertions(+), 327 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 3352dad6ca99..27308600da15 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -2124,6 +2124,9 @@ void cxgb4_inline_tx_skb(const struct sk_buff *skb, const struct sge_txq *q,
 void cxgb4_write_sgl(const struct sk_buff *skb, struct sge_txq *q,
 		     struct ulptx_sgl *sgl, u64 *end, unsigned int start,
 		     const dma_addr_t *addr);
+void cxgb4_write_partial_sgl(const struct sk_buff *skb, struct sge_txq *q,
+			     struct ulptx_sgl *sgl, u64 *end,
+			     const dma_addr_t *addr, u32 start, u32 send_len);
 void cxgb4_ring_tx_db(struct adapter *adap, struct sge_txq *q, int n);
 int t4_set_vlan_acl(struct adapter *adap, unsigned int mbox, unsigned int vf,
 		    u16 vlan);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c
index 01bd9c0dfe4e..196652a114c5 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c
@@ -890,6 +890,114 @@ void cxgb4_write_sgl(const struct sk_buff *skb, struct sge_txq *q,
 }
 EXPORT_SYMBOL(cxgb4_write_sgl);
 
+/*	cxgb4_write_partial_sgl - populate SGL for partial packet
+ *	@skb: the packet
+ *	@q: the Tx queue we are writing into
+ *	@sgl: starting location for writing the SGL
+ *	@end: points right after the end of the SGL
+ *	@addr: the list of bus addresses for the SGL elements
+ *	@start: start offset in the SKB where partial data starts
+ *	@len: length of data from @start to send out
+ *
+ *	This API will handle sending out partial data of a skb if required.
+ *	Unlike cxgb4_write_sgl, @start can be any offset into the skb data,
+ *	and @len will decide how much data after @start offset to send out.
+ */
+void cxgb4_write_partial_sgl(const struct sk_buff *skb, struct sge_txq *q,
+			     struct ulptx_sgl *sgl, u64 *end,
+			     const dma_addr_t *addr, u32 start, u32 len)
+{
+	struct ulptx_sge_pair buf[MAX_SKB_FRAGS / 2 + 1] = {0}, *to;
+	u32 frag_size, skb_linear_data_len = skb_headlen(skb);
+	struct skb_shared_info *si = skb_shinfo(skb);
+	u8 i = 0, frag_idx = 0, nfrags = 0;
+	skb_frag_t *frag;
+
+	/* Fill the first SGL either from linear data or from partial
+	 * frag based on @start.
+	 */
+	if (unlikely(start < skb_linear_data_len)) {
+		frag_size = min(len, skb_linear_data_len - start);
+		sgl->len0 = htonl(frag_size);
+		sgl->addr0 = cpu_to_be64(addr[0] + start);
+		len -= frag_size;
+		nfrags++;
+	} else {
+		start -= skb_linear_data_len;
+		frag = &si->frags[frag_idx];
+		frag_size = skb_frag_size(frag);
+		/* find the first frag */
+		while (start >= frag_size) {
+			start -= frag_size;
+			frag_idx++;
+			frag = &si->frags[frag_idx];
+			frag_size = skb_frag_size(frag);
+		}
+
+		frag_size = min(len, skb_frag_size(frag) - start);
+		sgl->len0 = cpu_to_be32(frag_size);
+		sgl->addr0 = cpu_to_be64(addr[frag_idx + 1] + start);
+		len -= frag_size;
+		nfrags++;
+		frag_idx++;
+	}
+
+	/* If the entire partial data fit in one SGL, then send it out
+	 * now.
+	 */
+	if (!len)
+		goto done;
+
+	/* Most of the complexity below deals with the possibility we hit the
+	 * end of the queue in the middle of writing the SGL.  For this case
+	 * only we create the SGL in a temporary buffer and then copy it.
+	 */
+	to = (u8 *)end > (u8 *)q->stat ? buf : sgl->sge;
+
+	/* If the skb couldn't fit in first SGL completely, fill the
+	 * rest of the frags in subsequent SGLs. Note that each SGL
+	 * pair can store 2 frags.
+	 */
+	while (len) {
+		frag_size = min(len, skb_frag_size(&si->frags[frag_idx]));
+		to->len[i & 1] = cpu_to_be32(frag_size);
+		to->addr[i & 1] = cpu_to_be64(addr[frag_idx + 1]);
+		if (i && (i & 1))
+			to++;
+		nfrags++;
+		frag_idx++;
+		i++;
+		len -= frag_size;
+	}
+
+	/* If we ended in an odd boundary, then set the second SGL's
+	 * length in the pair to 0.
+	 */
+	if (i & 1)
+		to->len[1] = cpu_to_be32(0);
+
+	/* Copy from temporary buffer to Tx ring, in case we hit the
+	 * end of the queue in the middle of writing the SGL.
+	 */
+	if (unlikely((u8 *)end > (u8 *)q->stat)) {
+		u32 part0 = (u8 *)q->stat - (u8 *)sgl->sge, part1;
+
+		if (likely(part0))
+			memcpy(sgl->sge, buf, part0);
+		part1 = (u8 *)end - (u8 *)q->stat;
+		memcpy(q->desc, (u8 *)buf + part0, part1);
+		end = (void *)q->desc + part1;
+	}
+
+	/* 0-pad to multiple of 16 */
+	if ((uintptr_t)end & 8)
+		*end = 0;
+done:
+	sgl->cmd_nsge = htonl(ULPTX_CMD_V(ULP_TX_SC_DSGL) |
+			ULPTX_NSGE_V(nfrags));
+}
+EXPORT_SYMBOL(cxgb4_write_partial_sgl);
+
 /* This function copies 64 byte coalesced work request to
  * memory mapped BAR2 space. For coalesced WR SGE fetches
  * data from the FIFO instead of from Host.
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
index c69c3901a707..9fa8e40ca0bf 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
@@ -14,6 +14,50 @@
 static LIST_HEAD(uld_ctx_list);
 static DEFINE_MUTEX(dev_mutex);
 
+/* chcr_get_nfrags_to_send: get the remaining nfrags after start offset
+ * @skb: skb
+ * @start: start offset.
+ * @len: how much data to send after @start
+ */
+static int chcr_get_nfrags_to_send(struct sk_buff *skb, u32 start, u32 len)
+{
+	struct skb_shared_info *si = skb_shinfo(skb);
+	u32 frag_size, skb_linear_data_len = skb_headlen(skb);
+	u8 nfrags = 0, frag_idx = 0;
+	skb_frag_t *frag;
+
+	/* if its a linear skb then return 1 */
+	if (!skb_is_nonlinear(skb))
+		return 1;
+
+	if (unlikely(start < skb_linear_data_len)) {
+		frag_size = min(len, skb_linear_data_len - start);
+		start = 0;
+	} else {
+		start -= skb_linear_data_len;
+
+		frag = &si->frags[frag_idx];
+		frag_size = skb_frag_size(frag);
+		while (start >= frag_size) {
+			start -= frag_size;
+			frag_idx++;
+			frag = &si->frags[frag_idx];
+			frag_size = skb_frag_size(frag);
+		}
+		frag_size = min(len, skb_frag_size(frag) - start);
+	}
+	len -= frag_size;
+	nfrags++;
+
+	while (len) {
+		frag_size = min(len, skb_frag_size(&si->frags[frag_idx]));
+		len -= frag_size;
+		nfrags++;
+		frag_idx++;
+	}
+	return nfrags;
+}
+
 static int chcr_init_tcb_fields(struct chcr_ktls_info *tx_info);
 /*
  * chcr_ktls_save_keys: calculate and save crypto keys.
@@ -865,38 +909,6 @@ static int chcr_ktls_xmit_tcb_cpls(struct chcr_ktls_info *tx_info,
 	return 0;
 }
 
-/*
- * chcr_ktls_skb_copy
- * @nskb - new skb where the frags to be added.
- * @skb - old skb from which frags will be copied.
- */
-static void chcr_ktls_skb_copy(struct sk_buff *skb, struct sk_buff *nskb)
-{
-	int i;
-
-	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
-		skb_shinfo(nskb)->frags[i] = skb_shinfo(skb)->frags[i];
-		__skb_frag_ref(&skb_shinfo(nskb)->frags[i]);
-	}
-
-	skb_shinfo(nskb)->nr_frags = skb_shinfo(skb)->nr_frags;
-	nskb->len += skb->data_len;
-	nskb->data_len = skb->data_len;
-	nskb->truesize += skb->data_len;
-}
-
-/*
- * chcr_ktls_get_tx_flits
- * returns number of flits to be sent out, it includes key context length, WR
- * size and skb fragments.
- */
-static unsigned int
-chcr_ktls_get_tx_flits(const struct sk_buff *skb, unsigned int key_ctx_len)
-{
-	return chcr_sgl_len(skb_shinfo(skb)->nr_frags) +
-	       DIV_ROUND_UP(key_ctx_len + CHCR_KTLS_WR_SIZE, 8);
-}
-
 /*
  * chcr_ktls_check_tcp_options: To check if there is any TCP option availbale
  * other than timestamp.
@@ -940,7 +952,6 @@ chcr_ktls_check_tcp_options(struct tcphdr *tcp)
  * @tx_info - driver specific tls info.
  * @skb - skb contains partial record..
  * @q - TX queue.
- * @tx_chan - channel number.
  * return: NETDEV_TX_OK/NETDEV_TX_BUSY.
  */
 static int chcr_ktls_write_tcp_options(struct chcr_ktls_info *tx_info,
@@ -1043,71 +1054,6 @@ static int chcr_ktls_write_tcp_options(struct chcr_ktls_info *tx_info,
 	return 0;
 }
 
-/* chcr_ktls_skb_shift - Shifts request length paged data from skb to another.
- * @tgt- buffer into which tail data gets added
- * @skb- buffer from which the paged data comes from
- * @shiftlen- shift up to this many bytes
- */
-static int chcr_ktls_skb_shift(struct sk_buff *tgt, struct sk_buff *skb,
-			       int shiftlen)
-{
-	skb_frag_t *fragfrom, *fragto;
-	int from, to, todo;
-
-	WARN_ON(shiftlen > skb->data_len);
-
-	todo = shiftlen;
-	from = 0;
-	to = 0;
-	fragfrom = &skb_shinfo(skb)->frags[from];
-
-	while ((todo > 0) && (from < skb_shinfo(skb)->nr_frags)) {
-		fragfrom = &skb_shinfo(skb)->frags[from];
-		fragto = &skb_shinfo(tgt)->frags[to];
-
-		if (todo >= skb_frag_size(fragfrom)) {
-			*fragto = *fragfrom;
-			todo -= skb_frag_size(fragfrom);
-			from++;
-			to++;
-
-		} else {
-			__skb_frag_ref(fragfrom);
-			skb_frag_page_copy(fragto, fragfrom);
-			skb_frag_off_copy(fragto, fragfrom);
-			skb_frag_size_set(fragto, todo);
-
-			skb_frag_off_add(fragfrom, todo);
-			skb_frag_size_sub(fragfrom, todo);
-			todo = 0;
-
-			to++;
-			break;
-		}
-	}
-
-	/* Ready to "commit" this state change to tgt */
-	skb_shinfo(tgt)->nr_frags = to;
-
-	/* Reposition in the original skb */
-	to = 0;
-	while (from < skb_shinfo(skb)->nr_frags)
-		skb_shinfo(skb)->frags[to++] = skb_shinfo(skb)->frags[from++];
-
-	skb_shinfo(skb)->nr_frags = to;
-
-	WARN_ON(todo > 0 && !skb_shinfo(skb)->nr_frags);
-
-	skb->len -= shiftlen;
-	skb->data_len -= shiftlen;
-	skb->truesize -= shiftlen;
-	tgt->len += shiftlen;
-	tgt->data_len += shiftlen;
-	tgt->truesize += shiftlen;
-
-	return shiftlen;
-}
-
 /*
  * chcr_ktls_xmit_wr_complete: This sends out the complete record. If an skb
  * received has partial end part of the record, send out the complete record, so
@@ -1121,9 +1067,11 @@ static int chcr_ktls_skb_shift(struct sk_buff *tgt, struct sk_buff *skb,
  * return: NETDEV_TX_BUSY/NET_TX_OK.
  */
 static int chcr_ktls_xmit_wr_complete(struct sk_buff *skb,
+				      bool is_last_wr,
 				      struct chcr_ktls_info *tx_info,
 				      struct sge_eth_txq *q, u32 tcp_seq,
-				      bool tcp_push, u32 mss)
+				      bool tcp_push, u32 mss, u32 data_len,
+				      u32 skb_offset, u32 nfrags)
 {
 	u32 len16, wr_mid = 0, flits = 0, ndesc, cipher_start;
 	struct adapter *adap = tx_info->adap;
@@ -1138,7 +1086,8 @@ static int chcr_ktls_xmit_wr_complete(struct sk_buff *skb,
 	u64 *end;
 
 	/* get the number of flits required */
-	flits = chcr_ktls_get_tx_flits(skb, tx_info->key_ctx_len);
+	flits = chcr_sgl_len(nfrags) +
+		DIV_ROUND_UP(tx_info->key_ctx_len + CHCR_KTLS_WR_SIZE, 8);
 	/* number of descriptors */
 	ndesc = chcr_flits_to_desc(flits);
 	/* check if enough credits available */
@@ -1167,6 +1116,9 @@ static int chcr_ktls_xmit_wr_complete(struct sk_buff *skb,
 		return NETDEV_TX_BUSY;
 	}
 
+	if (!is_last_wr)
+		skb_get(skb);
+
 	pos = &q->q.desc[q->q.pidx];
 	end = (u64 *)pos + flits;
 	/* FW_ULPTX_WR */
@@ -1199,7 +1151,7 @@ static int chcr_ktls_xmit_wr_complete(struct sk_buff *skb,
 		      CPL_TX_SEC_PDU_CPLLEN_V(CHCR_CPL_TX_SEC_PDU_LEN_64BIT) |
 		      CPL_TX_SEC_PDU_PLACEHOLDER_V(1) |
 		      CPL_TX_SEC_PDU_IVINSRTOFST_V(TLS_HEADER_SIZE + 1));
-	cpl->pldlen = htonl(skb->data_len);
+	cpl->pldlen = htonl(data_len);
 
 	/* encryption should start after tls header size + iv size */
 	cipher_start = TLS_HEADER_SIZE + tx_info->iv_size + 1;
@@ -1241,13 +1193,12 @@ static int chcr_ktls_xmit_wr_complete(struct sk_buff *skb,
 	/* CPL_TX_DATA */
 	tx_data = (void *)pos;
 	OPCODE_TID(tx_data) = htonl(MK_OPCODE_TID(CPL_TX_DATA, tx_info->tid));
-	tx_data->len = htonl(TX_DATA_MSS_V(mss) | TX_LENGTH_V(skb->data_len));
+	tx_data->len = htonl(TX_DATA_MSS_V(mss) | TX_LENGTH_V(data_len));
 
 	tx_data->rsvd = htonl(tcp_seq);
 
-	tx_data->flags = htonl(TX_BYPASS_F);
-	if (tcp_push)
-		tx_data->flags |= htonl(TX_PUSH_F | TX_SHOVE_F);
+	tx_data->flags = htonl(TX_BYPASS_F |
+			       (tcp_push ? TX_PUSH_F | TX_SHOVE_F : 0));
 
 	/* check left again, it might go beyond queue limit */
 	pos = tx_data + 1;
@@ -1261,8 +1212,8 @@ static int chcr_ktls_xmit_wr_complete(struct sk_buff *skb,
 	}
 
 	/* send the complete packet except the header */
-	cxgb4_write_sgl(skb, &q->q, pos, end, skb->len - skb->data_len,
-			sgl_sdesc->addr);
+	cxgb4_write_partial_sgl(skb, &q->q, pos, end, sgl_sdesc->addr,
+				skb_offset, data_len);
 	sgl_sdesc->skb = skb;
 
 	chcr_txq_advance(&q->q, ndesc);
@@ -1294,11 +1245,11 @@ static int chcr_ktls_xmit_wr_short(struct sk_buff *skb,
 				   struct sge_eth_txq *q,
 				   u32 tcp_seq, bool tcp_push, u32 mss,
 				   u32 tls_rec_offset, u8 *prior_data,
-				   u32 prior_data_len)
+				   u32 prior_data_len, u32 data_len,
+				   u32 skb_offset)
 {
+	u32 len16, wr_mid = 0, cipher_start, nfrags, flits = 0, ndesc;
 	struct adapter *adap = tx_info->adap;
-	u32 len16, wr_mid = 0, cipher_start;
-	unsigned int flits = 0, ndesc;
 	int credits, left, last_desc;
 	struct tx_sw_desc *sgl_sdesc;
 	struct cpl_tx_data *tx_data;
@@ -1310,14 +1261,17 @@ static int chcr_ktls_xmit_wr_short(struct sk_buff *skb,
 	void *pos;
 	u64 *end;
 
-	/* get the number of flits required, it's a partial record so 2 flits
-	 * (AES_BLOCK_SIZE) will be added.
+	nfrags = chcr_get_nfrags_to_send(skb, skb_offset, data_len);
+	flits = chcr_sgl_len(nfrags) +
+		DIV_ROUND_UP(tx_info->key_ctx_len + CHCR_KTLS_WR_SIZE, 8);
+	/* it's a partial record so 2 flits (AES_BLOCK_SIZE) will also be
+	 * added.
 	 */
-	flits = chcr_ktls_get_tx_flits(skb, tx_info->key_ctx_len) + 2;
+	flits += 2;
 	/* get the correct 8 byte IV of this record */
 	iv_record = cpu_to_be64(tx_info->iv + tx_info->record_no);
 	/* If it's a middle record and not 16 byte aligned to run AES CTR, need
-	 * to make it 16 byte aligned. So atleadt 2 extra flits of immediate
+	 * to make it 16 byte aligned. So atleast 2 extra flits of immediate
 	 * data will be added.
 	 */
 	if (prior_data_len)
@@ -1385,7 +1339,7 @@ static int chcr_ktls_xmit_wr_short(struct sk_buff *skb,
 		htonl(CPL_TX_SEC_PDU_OPCODE_V(CPL_TX_SEC_PDU) |
 		      CPL_TX_SEC_PDU_CPLLEN_V(CHCR_CPL_TX_SEC_PDU_LEN_64BIT) |
 		      CPL_TX_SEC_PDU_IVINSRTOFST_V(1));
-	cpl->pldlen = htonl(skb->data_len + AES_BLOCK_LEN + prior_data_len);
+	cpl->pldlen = htonl(data_len + AES_BLOCK_LEN + prior_data_len);
 	cpl->aadstart_cipherstop_hi =
 		htonl(CPL_TX_SEC_PDU_CIPHERSTART_V(cipher_start));
 	cpl->cipherstop_lo_authinsert = 0;
@@ -1416,11 +1370,10 @@ static int chcr_ktls_xmit_wr_short(struct sk_buff *skb,
 	tx_data = (void *)pos;
 	OPCODE_TID(tx_data) = htonl(MK_OPCODE_TID(CPL_TX_DATA, tx_info->tid));
 	tx_data->len = htonl(TX_DATA_MSS_V(mss) |
-			TX_LENGTH_V(skb->data_len + prior_data_len));
+			     TX_LENGTH_V(data_len + prior_data_len));
 	tx_data->rsvd = htonl(tcp_seq);
-	tx_data->flags = htonl(TX_BYPASS_F);
-	if (tcp_push)
-		tx_data->flags |= htonl(TX_PUSH_F | TX_SHOVE_F);
+	tx_data->flags = htonl(TX_BYPASS_F |
+			       (tcp_push ? TX_PUSH_F | TX_SHOVE_F : 0));
 
 	/* check left again, it might go beyond queue limit */
 	pos = tx_data + 1;
@@ -1449,8 +1402,8 @@ static int chcr_ktls_xmit_wr_short(struct sk_buff *skb,
 	if (prior_data_len)
 		pos = chcr_copy_to_txd(prior_data, &q->q, pos, 16);
 	/* send the complete packet except the header */
-	cxgb4_write_sgl(skb, &q->q, pos, end, skb->len - skb->data_len,
-			sgl_sdesc->addr);
+	cxgb4_write_partial_sgl(skb, &q->q, pos, end, sgl_sdesc->addr,
+				skb_offset, data_len);
 	sgl_sdesc->skb = skb;
 
 	chcr_txq_advance(&q->q, ndesc);
@@ -1477,24 +1430,22 @@ static int chcr_ktls_xmit_wr_short(struct sk_buff *skb,
 static int chcr_ktls_tx_plaintxt(struct chcr_ktls_info *tx_info,
 				 struct sk_buff *skb, u32 tcp_seq, u32 mss,
 				 bool tcp_push, struct sge_eth_txq *q,
-				 u32 port_id, u8 *prior_data,
-				 u32 prior_data_len)
+				 u32 data_len, u32 skb_offset)
 {
+	u32 flits = 0, ndesc, nfrags, wr_mid = 0;
 	int credits, left, len16, last_desc;
-	unsigned int flits = 0, ndesc;
 	struct tx_sw_desc *sgl_sdesc;
 	struct cpl_tx_data *tx_data;
 	struct ulptx_idata *idata;
 	struct ulp_txpkt *ulptx;
 	struct fw_ulptx_wr *wr;
-	u32 wr_mid = 0;
 	void *pos;
 	u64 *end;
 
 	flits = DIV_ROUND_UP(CHCR_PLAIN_TX_DATA_LEN, 8);
-	flits += chcr_sgl_len(skb_shinfo(skb)->nr_frags);
-	if (prior_data_len)
-		flits += 2;
+	nfrags = chcr_get_nfrags_to_send(skb, skb_offset, data_len);
+
+	flits += chcr_sgl_len(nfrags);
 	/* WR will need len16 */
 	len16 = DIV_ROUND_UP(flits, 2);
 	/* check how many descriptors needed */
@@ -1534,20 +1485,18 @@ static int chcr_ktls_tx_plaintxt(struct chcr_ktls_info *tx_info,
 	/* ULP_TXPKT */
 	ulptx = (struct ulp_txpkt *)(wr + 1);
 	ulptx->cmd_dest = htonl(ULPTX_CMD_V(ULP_TX_PKT) |
-			ULP_TXPKT_DATAMODIFY_V(0) |
 			ULP_TXPKT_CHANNELID_V(tx_info->port_id) |
-			ULP_TXPKT_DEST_V(0) |
-			ULP_TXPKT_FID_V(q->q.cntxt_id) | ULP_TXPKT_RO_V(1));
+			ULP_TXPKT_FID_V(q->q.cntxt_id) |
+			ULP_TXPKT_RO_F);
 	ulptx->len = htonl(len16 - 1);
 	/* ULPTX_IDATA sub-command */
 	idata = (struct ulptx_idata *)(ulptx + 1);
 	idata->cmd_more = htonl(ULPTX_CMD_V(ULP_TX_SC_IMM) | ULP_TX_SC_MORE_F);
-	idata->len = htonl(sizeof(*tx_data) + prior_data_len);
+	idata->len = htonl(sizeof(*tx_data));
 	/* CPL_TX_DATA */
 	tx_data = (struct cpl_tx_data *)(idata + 1);
 	OPCODE_TID(tx_data) = htonl(MK_OPCODE_TID(CPL_TX_DATA, tx_info->tid));
-	tx_data->len = htonl(TX_DATA_MSS_V(mss) |
-			TX_LENGTH_V(skb->data_len + prior_data_len));
+	tx_data->len = htonl(TX_DATA_MSS_V(mss) | TX_LENGTH_V(data_len));
 	/* set tcp seq number */
 	tx_data->rsvd = htonl(tcp_seq);
 	tx_data->flags = htonl(TX_BYPASS_F);
@@ -1555,12 +1504,6 @@ static int chcr_ktls_tx_plaintxt(struct chcr_ktls_info *tx_info,
 		tx_data->flags |= htonl(TX_PUSH_F | TX_SHOVE_F);
 
 	pos = tx_data + 1;
-	/* apart from prior_data_len, we should set remaining part of 16 bytes
-	 * to be zero.
-	 */
-	if (prior_data_len)
-		pos = chcr_copy_to_txd(prior_data, &q->q, pos, 16);
-
 	/* check left again, it might go beyond queue limit */
 	left = (void *)q->q.stat - pos;
 
@@ -1571,8 +1514,8 @@ static int chcr_ktls_tx_plaintxt(struct chcr_ktls_info *tx_info,
 		end = pos + left;
 	}
 	/* send the complete packet including the header */
-	cxgb4_write_sgl(skb, &q->q, pos, end, skb->len - skb->data_len,
-			sgl_sdesc->addr);
+	cxgb4_write_partial_sgl(skb, &q->q, pos, end, sgl_sdesc->addr,
+				skb_offset, data_len);
 	sgl_sdesc->skb = skb;
 
 	chcr_txq_advance(&q->q, ndesc);
@@ -1584,9 +1527,11 @@ static int chcr_ktls_tx_plaintxt(struct chcr_ktls_info *tx_info,
  * chcr_ktls_copy_record_in_skb
  * @nskb - new skb where the frags to be added.
  * @record - specific record which has complete 16k record in frags.
+ * @skb - old skb, to copy socket and destructor details.
  */
 static void chcr_ktls_copy_record_in_skb(struct sk_buff *nskb,
-					 struct tls_record_info *record)
+					 struct tls_record_info *record,
+					 struct sk_buff *skb)
 {
 	int i = 0;
 
@@ -1597,57 +1542,12 @@ static void chcr_ktls_copy_record_in_skb(struct sk_buff *nskb,
 	}
 
 	skb_shinfo(nskb)->nr_frags = record->num_frags;
-	nskb->data_len = record->len;
+	nskb->data_len += record->len;
 	nskb->len += record->len;
 	nskb->truesize += record->len;
-}
-
-/*
- * chcr_ktls_update_snd_una:  Reset the SEND_UNA. It will be done to avoid
- * sending the same segment again. It will discard the segment which is before
- * the current tx max.
- * @tx_info - driver specific tls info.
- * @q - TX queue.
- * return: NET_TX_OK/NET_XMIT_DROP.
- */
-static int chcr_ktls_update_snd_una(struct chcr_ktls_info *tx_info,
-				    struct sge_eth_txq *q)
-{
-	struct fw_ulptx_wr *wr;
-	unsigned int ndesc;
-	int credits;
-	void *pos;
-	u32 len;
-
-	len = sizeof(*wr) + roundup(CHCR_SET_TCB_FIELD_LEN, 16);
-	ndesc = DIV_ROUND_UP(len, 64);
-
-	credits = chcr_txq_avail(&q->q) - ndesc;
-	if (unlikely(credits < 0)) {
-		chcr_eth_txq_stop(q);
-		return NETDEV_TX_BUSY;
-	}
-
-	pos = &q->q.desc[q->q.pidx];
-
-	wr = pos;
-	/* ULPTX wr */
-	wr->op_to_compl = htonl(FW_WR_OP_V(FW_ULPTX_WR));
-	wr->cookie = 0;
-	/* fill len in wr field */
-	wr->flowid_len16 = htonl(FW_WR_LEN16_V(DIV_ROUND_UP(len, 16)));
-
-	pos += sizeof(*wr);
-
-	pos = chcr_write_cpl_set_tcb_ulp(tx_info, q, tx_info->tid, pos,
-					 TCB_SND_UNA_RAW_W,
-					 TCB_SND_UNA_RAW_V(TCB_SND_UNA_RAW_M),
-					 TCB_SND_UNA_RAW_V(0), 0);
-
-	chcr_txq_advance(&q->q, ndesc);
-	cxgb4_ring_tx_db(tx_info->adap, &q->q, ndesc);
-
-	return 0;
+	nskb->sk = skb->sk;
+	nskb->destructor = skb->destructor;
+	refcount_add(nskb->truesize, &nskb->sk->sk_wmem_alloc);
 }
 
 /*
@@ -1672,42 +1572,60 @@ static int chcr_end_part_handler(struct chcr_ktls_info *tx_info,
 				 struct tls_record_info *record,
 				 u32 tcp_seq, int mss, bool tcp_push_no_fin,
 				 struct sge_eth_txq *q,
-				 u32 tls_end_offset, bool last_wr)
+				 u32 tls_end_offset, u32 skb_offset)
 {
 	struct sk_buff *nskb = NULL;
+	bool is_last_wr = false;
+	int ret;
+
+	if (skb_offset + tls_end_offset == skb->len)
+		is_last_wr = true;
+
 	/* check if it is a complete record */
 	if (tls_end_offset == record->len) {
 		nskb = skb;
 		atomic64_inc(&tx_info->adap->ch_ktls_stats.ktls_tx_complete_pkts);
 	} else {
-		dev_kfree_skb_any(skb);
-
+		/* TAG needs to be calculated so, need to send complete record,
+		 * free the original skb and send a new one.
+		 */
 		nskb = alloc_skb(0, GFP_KERNEL);
-		if (!nskb)
+		if (!nskb) {
+			dev_kfree_skb_any(skb);
 			return NETDEV_TX_BUSY;
+		}
+
 		/* copy complete record in skb */
-		chcr_ktls_copy_record_in_skb(nskb, record);
+		chcr_ktls_copy_record_in_skb(nskb, record, skb);
 		/* packet is being sent from the beginning, update the tcp_seq
 		 * accordingly.
 		 */
 		tcp_seq = tls_record_start_seq(record);
-		/* reset snd una, so the middle record won't send the already
-		 * sent part.
-		 */
-		if (chcr_ktls_update_snd_una(tx_info, q))
-			goto out;
+		/* reset skb offset */
+		skb_offset = 0;
+
+		if (is_last_wr)
+			dev_kfree_skb_any(skb);
+
+		is_last_wr = true;
+
 		atomic64_inc(&tx_info->adap->ch_ktls_stats.ktls_tx_end_pkts);
 	}
 
-	if (chcr_ktls_xmit_wr_complete(nskb, tx_info, q, tcp_seq,
-				       (last_wr && tcp_push_no_fin),
-				       mss)) {
+	ret = chcr_ktls_xmit_wr_complete(nskb, is_last_wr,
+					 tx_info, q, tcp_seq,
+					 (is_last_wr && tcp_push_no_fin),
+					 mss, record->len, skb_offset,
+					 record->num_frags);
+	if (ret)
 		goto out;
-	}
+
+	tx_info->prev_seq = record->end_seq;
+
 	return 0;
 out:
 	dev_kfree_skb_any(nskb);
-	return NETDEV_TX_BUSY;
+	return ret;
 }
 
 /*
@@ -1735,41 +1653,46 @@ static int chcr_short_record_handler(struct chcr_ktls_info *tx_info,
 				     struct sk_buff *skb,
 				     struct tls_record_info *record,
 				     u32 tcp_seq, int mss, bool tcp_push_no_fin,
-				     struct sge_eth_txq *q, u32 tls_end_offset)
+				     struct sge_eth_txq *q, u32 tls_end_offset,
+				     u32 data_len, u32 skb_offset)
 {
-	u32 tls_rec_offset = tcp_seq - tls_record_start_seq(record);
+	u32 tls_rec_offset = record->len - tls_end_offset;
 	u8 prior_data[16] = {0};
 	u32 prior_data_len = 0;
-	u32 data_len;
 
 	/* check if the skb is ending in middle of tag/HASH, its a big
 	 * trouble, send the packet before the HASH.
 	 */
-	int remaining_record = tls_end_offset - skb->data_len;
+	int remaining_record = tls_end_offset - data_len;
 
 	if (remaining_record > 0 &&
 	    remaining_record < TLS_CIPHER_AES_GCM_128_TAG_SIZE) {
-		int trimmed_len = skb->data_len -
-			(TLS_CIPHER_AES_GCM_128_TAG_SIZE - remaining_record);
-		struct sk_buff *tmp_skb = NULL;
-		/* don't process the pkt if it is only a partial tag */
-		if (skb->data_len < TLS_CIPHER_AES_GCM_128_TAG_SIZE)
+		int trimmed_len = 0;
+
+		if (tls_end_offset > TLS_CIPHER_AES_GCM_128_TAG_SIZE)
+			trimmed_len = data_len -
+				      (TLS_CIPHER_AES_GCM_128_TAG_SIZE -
+				       remaining_record);
+		if (!trimmed_len)
 			goto out;
 
-		WARN_ON(trimmed_len > skb->data_len);
+		WARN_ON(trimmed_len > data_len);
+
+		data_len = trimmed_len;
+		atomic64_inc(&tx_info->adap->ch_ktls_stats.ktls_tx_trimmed_pkts);
+	}
 
-		/* shift to those many bytes */
-		tmp_skb = alloc_skb(0, GFP_KERNEL);
-		if (unlikely(!tmp_skb))
+	/* check if it is only the header part. */
+	if (tls_rec_offset + data_len <= (TLS_HEADER_SIZE + tx_info->iv_size)) {
+		if (chcr_ktls_tx_plaintxt(tx_info, skb, tcp_seq, mss,
+					  tcp_push_no_fin, q, data_len,
+					  skb_offset))
 			goto out;
 
-		chcr_ktls_skb_shift(tmp_skb, skb, trimmed_len);
-		/* free the last trimmed portion */
-		dev_kfree_skb_any(skb);
-		skb = tmp_skb;
-		atomic64_inc(&tx_info->adap->ch_ktls_stats.ktls_tx_trimmed_pkts);
+		tx_info->prev_seq = tcp_seq + data_len;
+		return 0;
+
 	}
-	data_len = skb->data_len;
 	/* check if the middle record's start point is 16 byte aligned. CTR
 	 * needs 16 byte aligned start point to start encryption.
 	 */
@@ -1830,39 +1753,19 @@ static int chcr_short_record_handler(struct chcr_ktls_info *tx_info,
 			}
 			/* reset tcp_seq as per the prior_data_required len */
 			tcp_seq -= prior_data_len;
-			/* include prio_data_len for  further calculation.
-			 */
-			data_len += prior_data_len;
 		}
-		/* reset snd una, so the middle record won't send the already
-		 * sent part.
-		 */
-		if (chcr_ktls_update_snd_una(tx_info, q))
-			goto out;
 		atomic64_inc(&tx_info->adap->ch_ktls_stats.ktls_tx_middle_pkts);
 	} else {
-		/* Else means, its a partial first part of the record. Check if
-		 * its only the header, don't need to send for encryption then.
-		 */
-		if (data_len <= TLS_HEADER_SIZE + tx_info->iv_size) {
-			if (chcr_ktls_tx_plaintxt(tx_info, skb, tcp_seq, mss,
-						  tcp_push_no_fin, q,
-						  tx_info->port_id,
-						  prior_data,
-						  prior_data_len)) {
-				goto out;
-			}
-			return 0;
-		}
 		atomic64_inc(&tx_info->adap->ch_ktls_stats.ktls_tx_start_pkts);
 	}
 
 	if (chcr_ktls_xmit_wr_short(skb, tx_info, q, tcp_seq, tcp_push_no_fin,
 				    mss, tls_rec_offset, prior_data,
-				    prior_data_len)) {
+				    prior_data_len, data_len, skb_offset)) {
 		goto out;
 	}
 
+	tx_info->prev_seq = tcp_seq + data_len + prior_data_len;
 	return 0;
 out:
 	dev_kfree_skb_any(skb);
@@ -1881,7 +1784,6 @@ static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct tls_record_info *record;
 	struct chcr_ktls_info *tx_info;
 	struct tls_context *tls_ctx;
-	struct sk_buff *local_skb;
 	struct sge_eth_txq *q;
 	struct adapter *adap;
 	unsigned long flags;
@@ -1903,51 +1805,23 @@ static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (unlikely(!tx_info))
 		goto out;
 
-	/* don't touch the original skb, make a new skb to extract each records
-	 * and send them separately.
-	 */
-	local_skb = alloc_skb(0, GFP_KERNEL);
-
-	if (unlikely(!local_skb))
-		return NETDEV_TX_BUSY;
-
 	adap = tx_info->adap;
 	stats = &adap->ch_ktls_stats;
 	port_stats = &stats->ktls_port[tx_info->port_id];
 
 	qidx = skb->queue_mapping;
 	q = &adap->sge.ethtxq[qidx + tx_info->first_qset];
-	cxgb4_reclaim_completed_tx(adap, &q->q, true);
-	/* if tcp options are set but finish is not send the options first */
-	if (!th->fin && chcr_ktls_check_tcp_options(th)) {
-		ret = chcr_ktls_write_tcp_options(tx_info, skb, q, false);
-		if (ret)
-			return NETDEV_TX_BUSY;
-	}
-	/* update tcb */
-	ret = chcr_ktls_xmit_tcb_cpls(tx_info, q, ntohl(th->seq),
-				      ntohl(th->ack_seq),
-				      ntohs(th->window));
-	if (ret) {
-		dev_kfree_skb_any(local_skb);
-		return NETDEV_TX_BUSY;
-	}
 
-	/* copy skb contents into local skb */
-	chcr_ktls_skb_copy(skb, local_skb);
+	cxgb4_reclaim_completed_tx(adap, &q->q, true);
+	skb_tx_timestamp(skb);
 
-	/* TCP segments can be in received either complete or partial.
-	 * chcr_end_part_handler will handle cases if complete record or end
-	 * part of the record is received. Incase of partial end part of record,
-	 * we will send the complete record again.
+	/* TCP segments can be received either complete or partial. Incase of
+	 * partial end part of record, we will send the complete record again.
 	 */
+	/* lock taken */
+	spin_lock_irqsave(&tx_ctx->base.lock, flags);
 
 	do {
-		int i;
-
-		cxgb4_reclaim_completed_tx(adap, &q->q, true);
-		/* lock taken */
-		spin_lock_irqsave(&tx_ctx->base.lock, flags);
 		/* fetch the tls record */
 		record = tls_get_record(&tx_ctx->base, tcp_seq,
 					&tx_info->record_no);
@@ -1966,82 +1840,97 @@ static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev)
 			goto out;
 		}
 
-		/* increase page reference count of the record, so that there
-		 * won't be any chance of page free in middle if in case stack
-		 * receives ACK and try to delete the record.
-		 */
-		for (i = 0; i < record->num_frags; i++)
-			__skb_frag_ref(&record->frags[i]);
-		/* lock cleared */
-		spin_unlock_irqrestore(&tx_ctx->base.lock, flags);
-
 		tls_end_offset = record->end_seq - tcp_seq;
 
-		pr_debug("seq 0x%x, end_seq 0x%x prev_seq 0x%x, datalen 0x%x\n",
-			 tcp_seq, record->end_seq, tx_info->prev_seq, data_len);
-		/* if a tls record is finishing in this SKB */
-		if (tls_end_offset <= data_len) {
-			struct sk_buff *nskb = NULL;
+		pr_debug("seq %#x, start %#x end %#x prev %#x, datalen %d offset %d\n",
+			 tcp_seq, tls_record_start_seq(record), record->end_seq,
+			 tx_info->prev_seq, data_len, tls_end_offset);
 
-			if (tls_end_offset < data_len) {
-				nskb = alloc_skb(0, GFP_KERNEL);
-				if (unlikely(!nskb)) {
-					ret = -ENOMEM;
-					goto clear_ref;
+		/* update tcb for the skb */
+		if (skb_data_len == data_len) {
+			u32 tx_max = tcp_seq;
+
+			if (!tls_record_is_start_marker(record) &&
+			    tls_end_offset < TLS_CIPHER_AES_GCM_128_TAG_SIZE)
+				tx_max = record->end_seq -
+					 TLS_CIPHER_AES_GCM_128_TAG_SIZE;
+			/* if tcp options are set but finish is not send the
+			 * options first
+			 */
+			if (!th->fin && chcr_ktls_check_tcp_options(th)) {
+				ret = chcr_ktls_write_tcp_options(tx_info, skb,
+								  q, false);
+				if (ret) {
+					spin_unlock_irqrestore(&tx_ctx->base.lock,
+							       flags);
+					goto out;
 				}
+			}
 
-				chcr_ktls_skb_shift(nskb, local_skb,
-						    tls_end_offset);
-			} else {
-				/* its the only record in this skb, directly
-				 * point it.
-				 */
-				nskb = local_skb;
+			ret = chcr_ktls_xmit_tcb_cpls(tx_info, q, tx_max,
+						      ntohl(th->ack_seq),
+						      ntohs(th->window));
+			if (ret) {
+				spin_unlock_irqrestore(&tx_ctx->base.lock,
+						       flags);
+				goto out;
 			}
-			ret = chcr_end_part_handler(tx_info, nskb, record,
+
+			if (th->fin)
+				skb_get(skb);
+		}
+
+		/* if a tls record is finishing in this SKB */
+		if (tls_end_offset <= data_len) {
+			ret = chcr_end_part_handler(tx_info, skb, record,
 						    tcp_seq, mss,
 						    (!th->fin && th->psh), q,
 						    tls_end_offset,
-						    (nskb == local_skb));
-
-			if (ret && nskb != local_skb)
-				dev_kfree_skb_any(local_skb);
+						    skb_offset);
 
 			data_len -= tls_end_offset;
 			/* tcp_seq increment is required to handle next record.
 			 */
-			tcp_seq += tls_end_offset;
+			tcp_seq = record->end_seq;
+			skb_offset += tls_end_offset;
 		} else {
-			ret = chcr_short_record_handler(tx_info, local_skb,
+			ret = chcr_short_record_handler(tx_info, skb,
 							record, tcp_seq, mss,
 							(!th->fin && th->psh),
-							q, tls_end_offset);
+							q, tls_end_offset,
+							data_len,
+							skb_offset);
 			data_len = 0;
 		}
-clear_ref:
-		/* clear the frag ref count which increased locally before */
-		for (i = 0; i < record->num_frags; i++) {
-			/* clear the frag ref count */
-			__skb_frag_unref(&record->frags[i]);
-		}
+
 		/* if any failure, come out from the loop. */
-		if (ret)
-			goto out;
+		if (ret) {
+			spin_unlock_irqrestore(&tx_ctx->base.lock, flags);
+			/* clear the extra ref count taken if fin is set. */
+			if (th->fin)
+				dev_kfree_skb_any(skb);
+
+			return NETDEV_TX_OK;
+		}
+
 		/* length should never be less than 0 */
 		WARN_ON(data_len < 0);
 
 	} while (data_len > 0);
 
-	tx_info->prev_seq = ntohl(th->seq) + skb_data_len;
+	spin_unlock_irqrestore(&tx_ctx->base.lock, flags);
 	atomic64_inc(&port_stats->ktls_tx_encrypted_packets);
 	atomic64_add(skb_data_len, &port_stats->ktls_tx_encrypted_bytes);
 
 	/* tcp finish is set, send a separate tcp msg including all the options
 	 * as well.
 	 */
-	if (th->fin)
+	if (th->fin) {
 		chcr_ktls_write_tcp_options(tx_info, skb, q, true);
+		dev_kfree_skb_any(skb);
+	}
 
+	return NETDEV_TX_OK;
 out:
 	dev_kfree_skb_any(skb);
 	return NETDEV_TX_OK;
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [net 4/7] ch_ktls: Correction in middle record handling
  2020-10-22 10:10 [net 0/7] cxgb4/ch_ktls: Fixes in nic tls code Rohit Maheshwari
                   ` (2 preceding siblings ...)
  2020-10-22 10:10 ` [net 3/7] cxgb4/ch_ktls: creating skbs causes panic Rohit Maheshwari
@ 2020-10-22 10:10 ` Rohit Maheshwari
  2020-10-22 10:10 ` [net 5/7] ch_ktls: packet handling prior to start marker Rohit Maheshwari
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Rohit Maheshwari @ 2020-10-22 10:10 UTC (permalink / raw)
  To: kuba, netdev, davem; +Cc: secdev, Rohit Maheshwari

If a record starts in middle, reset TCB UNA so that
we could avoid sending out extra packet which is
needed to make it 16 byte aligned to start AES CTR.

Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
---
 .../chelsio/inline_crypto/ch_ktls/chcr_ktls.c | 21 +++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
index 9fa8e40ca0bf..ebbc9af9d551 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
@@ -827,11 +827,11 @@ static void *chcr_write_cpl_set_tcb_ulp(struct chcr_ktls_info *tx_info,
  */
 static int chcr_ktls_xmit_tcb_cpls(struct chcr_ktls_info *tx_info,
 				   struct sge_eth_txq *q, u64 tcp_seq,
-				   u64 tcp_ack, u64 tcp_win)
+				   u64 tcp_ack, u64 tcp_win, bool offset)
 {
 	bool first_wr = ((tx_info->prev_ack == 0) && (tx_info->prev_win == 0));
 	struct ch_ktls_port_stats_debug *port_stats;
-	u32 len, cpl = 0, ndesc, wr_len;
+	u32 len, cpl = 0, ndesc, wr_len, wr_mid = 0;
 	struct fw_ulptx_wr *wr;
 	int credits;
 	void *pos;
@@ -847,6 +847,11 @@ static int chcr_ktls_xmit_tcb_cpls(struct chcr_ktls_info *tx_info,
 		return NETDEV_TX_BUSY;
 	}
 
+	if (unlikely(credits < ETHTXQ_STOP_THRES)) {
+		chcr_eth_txq_stop(q);
+		wr_mid |= FW_WR_EQUEQ_F | FW_WR_EQUIQ_F;
+	}
+
 	pos = &q->q.desc[q->q.pidx];
 	/* make space for WR, we'll fill it later when we know all the cpls
 	 * being sent out and have complete length.
@@ -862,7 +867,7 @@ static int chcr_ktls_xmit_tcb_cpls(struct chcr_ktls_info *tx_info,
 		cpl++;
 	}
 	/* reset snd una if it's a re-transmit pkt */
-	if (tcp_seq != tx_info->prev_seq) {
+	if (tcp_seq != tx_info->prev_seq || offset) {
 		/* reset snd_una */
 		port_stats =
 			&tx_info->adap->ch_ktls_stats.ktls_port[tx_info->port_id];
@@ -871,7 +876,8 @@ static int chcr_ktls_xmit_tcb_cpls(struct chcr_ktls_info *tx_info,
 						 TCB_SND_UNA_RAW_V
 						 (TCB_SND_UNA_RAW_M),
 						 TCB_SND_UNA_RAW_V(0), 0);
-		atomic64_inc(&port_stats->ktls_tx_ooo);
+		if (tcp_seq != tx_info->prev_seq)
+			atomic64_inc(&port_stats->ktls_tx_ooo);
 		cpl++;
 	}
 	/* update ack */
@@ -900,7 +906,8 @@ static int chcr_ktls_xmit_tcb_cpls(struct chcr_ktls_info *tx_info,
 		wr->op_to_compl = htonl(FW_WR_OP_V(FW_ULPTX_WR));
 		wr->cookie = 0;
 		/* fill len in wr field */
-		wr->flowid_len16 = htonl(FW_WR_LEN16_V(DIV_ROUND_UP(len, 16)));
+		wr->flowid_len16 = htonl(wr_mid |
+					 FW_WR_LEN16_V(DIV_ROUND_UP(len, 16)));
 
 		ndesc = DIV_ROUND_UP(len, 64);
 		chcr_txq_advance(&q->q, ndesc);
@@ -1869,7 +1876,9 @@ static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev)
 
 			ret = chcr_ktls_xmit_tcb_cpls(tx_info, q, tx_max,
 						      ntohl(th->ack_seq),
-						      ntohs(th->window));
+						      ntohs(th->window),
+						      tls_end_offset !=
+						      record->len);
 			if (ret) {
 				spin_unlock_irqrestore(&tx_ctx->base.lock,
 						       flags);
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [net 5/7] ch_ktls: packet handling prior to start marker
  2020-10-22 10:10 [net 0/7] cxgb4/ch_ktls: Fixes in nic tls code Rohit Maheshwari
                   ` (3 preceding siblings ...)
  2020-10-22 10:10 ` [net 4/7] ch_ktls: Correction in middle record handling Rohit Maheshwari
@ 2020-10-22 10:10 ` Rohit Maheshwari
  2020-10-22 10:10 ` [net 6/7] ch_ktls/cxgb4: handle partial tag alone SKBs Rohit Maheshwari
  2020-10-22 10:10 ` [net 7/7] ch_ktls: tcb update fails sometimes Rohit Maheshwari
  6 siblings, 0 replies; 10+ messages in thread
From: Rohit Maheshwari @ 2020-10-22 10:10 UTC (permalink / raw)
  To: kuba, netdev, davem; +Cc: secdev, Rohit Maheshwari

There could be a case where ACK for tls exchanges prior to start
marker is missed out, and by the time tls is offloaded. This pkt
should not be discarded and handled carefully. It could be
plaintext alone or plaintext + finish as well.

Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
---
 .../chelsio/inline_crypto/ch_ktls/chcr_ktls.c | 36 +++++++++++++++----
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
index ebbc9af9d551..9cb987607f3d 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
@@ -1841,12 +1841,6 @@ static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev)
 			goto out;
 		}
 
-		if (unlikely(tls_record_is_start_marker(record))) {
-			spin_unlock_irqrestore(&tx_ctx->base.lock, flags);
-			atomic64_inc(&port_stats->ktls_tx_skip_no_sync_data);
-			goto out;
-		}
-
 		tls_end_offset = record->end_seq - tcp_seq;
 
 		pr_debug("seq %#x, start %#x end %#x prev %#x, datalen %d offset %d\n",
@@ -1889,6 +1883,36 @@ static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev)
 				skb_get(skb);
 		}
 
+		if (unlikely(tls_record_is_start_marker(record))) {
+			atomic64_inc(&port_stats->ktls_tx_skip_no_sync_data);
+			/* If tls_end_offset < data_len, means there is some
+			 * data after start marker, which needs encryption, send
+			 * plaintext first and take skb refcount. else send out
+			 * complete pkt as plaintext.
+			 */
+			if (tls_end_offset < data_len)
+				skb_get(skb);
+			else
+				tls_end_offset = data_len;
+
+			ret = chcr_ktls_tx_plaintxt(tx_info, skb, tcp_seq, mss,
+						    (!th->fin && th->psh), q,
+						    tls_end_offset, skb_offset);
+			if (ret) {
+				/* free the refcount taken earlier */
+				if (tls_end_offset < data_len)
+					dev_kfree_skb_any(skb);
+				spin_unlock_irqrestore(&tx_ctx->base.lock,
+						       flags);
+				goto out;
+			}
+
+			data_len -= tls_end_offset;
+			tcp_seq = record->end_seq;
+			skb_offset += tls_end_offset;
+			continue;
+		}
+
 		/* if a tls record is finishing in this SKB */
 		if (tls_end_offset <= data_len) {
 			ret = chcr_end_part_handler(tx_info, skb, record,
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [net 6/7] ch_ktls/cxgb4: handle partial tag alone SKBs
  2020-10-22 10:10 [net 0/7] cxgb4/ch_ktls: Fixes in nic tls code Rohit Maheshwari
                   ` (4 preceding siblings ...)
  2020-10-22 10:10 ` [net 5/7] ch_ktls: packet handling prior to start marker Rohit Maheshwari
@ 2020-10-22 10:10 ` Rohit Maheshwari
  2020-10-22 23:52   ` Jakub Kicinski
  2020-10-22 10:10 ` [net 7/7] ch_ktls: tcb update fails sometimes Rohit Maheshwari
  6 siblings, 1 reply; 10+ messages in thread
From: Rohit Maheshwari @ 2020-10-22 10:10 UTC (permalink / raw)
  To: kuba, netdev, davem; +Cc: secdev, Rohit Maheshwari

If TCP congestion caused a very small packets which only has some
part fo the TAG, and that too is not till the end. HW can't handle
such case, so falling back to sw crypto in such cases.

Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
---
 .../ethernet/chelsio/cxgb4/cxgb4_debugfs.c    |   2 +
 .../net/ethernet/chelsio/cxgb4/cxgb4_uld.h    |   1 +
 .../chelsio/inline_crypto/ch_ktls/chcr_ktls.c | 114 +++++++++++++++++-
 .../chelsio/inline_crypto/ch_ktls/chcr_ktls.h |   1 +
 4 files changed, 117 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
index 0273f40b85f7..17410fe86626 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
@@ -3573,6 +3573,8 @@ static int chcr_stats_show(struct seq_file *seq, void *v)
 		   atomic64_read(&adap->ch_ktls_stats.ktls_tx_complete_pkts));
 	seq_printf(seq, "TX trim pkts :                    %20llu\n",
 		   atomic64_read(&adap->ch_ktls_stats.ktls_tx_trimmed_pkts));
+	seq_printf(seq, "TX sw fallback :                  %20llu\n",
+		   atomic64_read(&adap->ch_ktls_stats.ktls_tx_fallback));
 	while (i < MAX_NPORTS) {
 		ktls_port = &adap->ch_ktls_stats.ktls_port[i];
 		seq_printf(seq, "Port %d\n", i);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
index 65a8d4d4c6e5..c3e51da08877 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
@@ -388,6 +388,7 @@ struct ch_ktls_stats_debug {
 	atomic64_t ktls_tx_retransmit_pkts;
 	atomic64_t ktls_tx_complete_pkts;
 	atomic64_t ktls_tx_trimmed_pkts;
+	atomic64_t ktls_tx_fallback;
 };
 #endif
 
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
index 9cb987607f3d..e1c6f9489a3f 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
@@ -1530,6 +1530,88 @@ static int chcr_ktls_tx_plaintxt(struct chcr_ktls_info *tx_info,
 	return 0;
 }
 
+static int chcr_ktls_tunnel_pkt(struct chcr_ktls_info *tx_info,
+				struct sk_buff *skb,
+				struct sge_eth_txq *q)
+{
+	u32 ctrl, iplen, maclen, wr_mid = 0, len16;
+	struct tx_sw_desc *sgl_sdesc;
+	struct fw_eth_tx_pkt_wr *wr;
+	struct cpl_tx_pkt_core *cpl;
+	unsigned int flits, ndesc;
+	int credits, last_desc;
+	u64 cntrl1, *end;
+	void *pos;
+
+	ctrl = sizeof(*cpl);
+	flits = DIV_ROUND_UP(sizeof(*wr) + ctrl, 8);
+
+	flits += chcr_sgl_len(skb_shinfo(skb)->nr_frags + 1);
+	len16 = DIV_ROUND_UP(flits, 2);
+	/* check how many descriptors needed */
+	ndesc = DIV_ROUND_UP(flits, 8);
+
+	credits = chcr_txq_avail(&q->q) - ndesc;
+	if (unlikely(credits < 0)) {
+		chcr_eth_txq_stop(q);
+		return -ENOMEM;
+	}
+
+	if (unlikely(credits < ETHTXQ_STOP_THRES)) {
+		chcr_eth_txq_stop(q);
+		wr_mid |= FW_WR_EQUEQ_F | FW_WR_EQUIQ_F;
+	}
+
+	last_desc = q->q.pidx + ndesc - 1;
+	if (last_desc >= q->q.size)
+		last_desc -= q->q.size;
+	sgl_sdesc = &q->q.sdesc[last_desc];
+
+	if (unlikely(cxgb4_map_skb(tx_info->adap->pdev_dev, skb,
+				   sgl_sdesc->addr) < 0)) {
+		memset(sgl_sdesc->addr, 0, sizeof(sgl_sdesc->addr));
+		q->mapping_err++;
+		return -ENOMEM;
+	}
+
+	iplen = skb_network_header_len(skb);
+	maclen = skb_mac_header_len(skb);
+
+	pos = &q->q.desc[q->q.pidx];
+	end = (u64 *)pos + flits;
+	wr = pos;
+
+	/* Firmware work request header */
+	wr->op_immdlen = htonl(FW_WR_OP_V(FW_ETH_TX_PKT_WR) |
+			       FW_WR_IMMDLEN_V(ctrl));
+
+	wr->equiq_to_len16 = htonl(wr_mid | FW_WR_LEN16_V(len16));
+	wr->r3 = 0;
+
+	cpl = (void *)(wr + 1);
+
+	/* CPL header */
+	cpl->ctrl0 = htonl(TXPKT_OPCODE_V(CPL_TX_PKT) |
+			   TXPKT_INTF_V(tx_info->tx_chan) |
+			   TXPKT_PF_V(tx_info->adap->pf));
+	cpl->pack = 0;
+	cntrl1 = TXPKT_CSUM_TYPE_V(tx_info->ip_family == AF_INET ?
+				   TX_CSUM_TCPIP : TX_CSUM_TCPIP6);
+	cntrl1 |= T6_TXPKT_ETHHDR_LEN_V(maclen - ETH_HLEN) |
+		  TXPKT_IPHDR_LEN_V(iplen);
+	/* checksum offload */
+	cpl->ctrl1 = cpu_to_be64(cntrl1);
+	cpl->len = htons(skb->len);
+
+	pos = cpl + 1;
+
+	cxgb4_write_sgl(skb, &q->q, pos, end, 0, sgl_sdesc->addr);
+	sgl_sdesc->skb = skb;
+	chcr_txq_advance(&q->q, ndesc);
+	cxgb4_ring_tx_db(tx_info->adap, &q->q, ndesc);
+	return 0;
+}
+
 /*
  * chcr_ktls_copy_record_in_skb
  * @nskb - new skb where the frags to be added.
@@ -1681,7 +1763,7 @@ static int chcr_short_record_handler(struct chcr_ktls_info *tx_info,
 				      (TLS_CIPHER_AES_GCM_128_TAG_SIZE -
 				       remaining_record);
 		if (!trimmed_len)
-			goto out;
+			return FALLBACK;
 
 		WARN_ON(trimmed_len > data_len);
 
@@ -1779,6 +1861,33 @@ static int chcr_short_record_handler(struct chcr_ktls_info *tx_info,
 	return NETDEV_TX_BUSY;
 }
 
+int chcr_ktls_sw_fallback(struct sk_buff *skb, struct chcr_ktls_info *tx_info,
+			  struct sge_eth_txq *q)
+{
+	u32 data_len, skb_offset;
+	struct sk_buff *nskb;
+	struct tcphdr *th;
+
+	nskb = tls_encrypt_skb(skb);
+
+	if (!nskb)
+		return 0;
+
+	th = tcp_hdr(nskb);
+	skb_offset =  skb_transport_offset(nskb) + tcp_hdrlen(nskb);
+	data_len = nskb->len - skb_offset;
+	skb_tx_timestamp(nskb);
+
+	if (chcr_ktls_tunnel_pkt(tx_info, nskb, q))
+		goto out;
+
+	tx_info->prev_seq = ntohl(th->seq) + data_len;
+	atomic64_inc(&tx_info->adap->ch_ktls_stats.ktls_tx_fallback);
+	return 0;
+out:
+	dev_kfree_skb_any(nskb);
+	return 0;
+}
 /* nic tls TX handler */
 static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev)
 {
@@ -1943,6 +2052,9 @@ static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev)
 			if (th->fin)
 				dev_kfree_skb_any(skb);
 
+			if (ret == FALLBACK)
+				return chcr_ktls_sw_fallback(skb, tx_info, q);
+
 			return NETDEV_TX_OK;
 		}
 
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.h b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.h
index c1651b1431a0..18b3b1f02415 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.h
+++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.h
@@ -26,6 +26,7 @@
 
 #define CHCR_KTLS_WR_SIZE	(CHCR_PLAIN_TX_DATA_LEN +\
 				 sizeof(struct cpl_tx_sec_pdu))
+#define FALLBACK		35
 
 enum ch_ktls_open_state {
 	CH_KTLS_OPEN_SUCCESS = 0,
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [net 7/7] ch_ktls: tcb update fails sometimes
  2020-10-22 10:10 [net 0/7] cxgb4/ch_ktls: Fixes in nic tls code Rohit Maheshwari
                   ` (5 preceding siblings ...)
  2020-10-22 10:10 ` [net 6/7] ch_ktls/cxgb4: handle partial tag alone SKBs Rohit Maheshwari
@ 2020-10-22 10:10 ` Rohit Maheshwari
  6 siblings, 0 replies; 10+ messages in thread
From: Rohit Maheshwari @ 2020-10-22 10:10 UTC (permalink / raw)
  To: kuba, netdev, davem; +Cc: secdev, Rohit Maheshwari

context id and port id should be filled while sending tcb update.

Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
---
 .../chelsio/inline_crypto/ch_ktls/chcr_ktls.c        | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
index e1c6f9489a3f..c5ea2884a313 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
@@ -733,7 +733,8 @@ static int chcr_ktls_cpl_set_tcb_rpl(struct adapter *adap, unsigned char *input)
 }
 
 static void *__chcr_write_cpl_set_tcb_ulp(struct chcr_ktls_info *tx_info,
-					u32 tid, void *pos, u16 word, u64 mask,
+					u32 tid, void *pos, u16 word,
+					struct sge_eth_txq *q, u64 mask,
 					u64 val, u32 reply)
 {
 	struct cpl_set_tcb_field_core *cpl;
@@ -742,7 +743,10 @@ static void *__chcr_write_cpl_set_tcb_ulp(struct chcr_ktls_info *tx_info,
 
 	/* ULP_TXPKT */
 	txpkt = pos;
-	txpkt->cmd_dest = htonl(ULPTX_CMD_V(ULP_TX_PKT) | ULP_TXPKT_DEST_V(0));
+	txpkt->cmd_dest = htonl(ULPTX_CMD_V(ULP_TX_PKT) |
+				ULP_TXPKT_CHANNELID_V(tx_info->port_id) |
+				ULP_TXPKT_FID_V(q->q.cntxt_id) |
+				ULP_TXPKT_RO_F);
 	txpkt->len = htonl(DIV_ROUND_UP(CHCR_SET_TCB_FIELD_LEN, 16));
 
 	/* ULPTX_IDATA sub-command */
@@ -797,7 +801,7 @@ static void *chcr_write_cpl_set_tcb_ulp(struct chcr_ktls_info *tx_info,
 		} else {
 			u8 buf[48] = {0};
 
-			__chcr_write_cpl_set_tcb_ulp(tx_info, tid, buf, word,
+			__chcr_write_cpl_set_tcb_ulp(tx_info, tid, buf, word, q,
 						     mask, val, reply);
 
 			return chcr_copy_to_txd(buf, &q->q, pos,
@@ -805,7 +809,7 @@ static void *chcr_write_cpl_set_tcb_ulp(struct chcr_ktls_info *tx_info,
 		}
 	}
 
-	pos = __chcr_write_cpl_set_tcb_ulp(tx_info, tid, pos, word,
+	pos = __chcr_write_cpl_set_tcb_ulp(tx_info, tid, pos, word, q,
 					   mask, val, reply);
 
 	/* check again if we are at the end of the queue */
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [net 6/7] ch_ktls/cxgb4: handle partial tag alone SKBs
  2020-10-22 10:10 ` [net 6/7] ch_ktls/cxgb4: handle partial tag alone SKBs Rohit Maheshwari
@ 2020-10-22 23:52   ` Jakub Kicinski
  0 siblings, 0 replies; 10+ messages in thread
From: Jakub Kicinski @ 2020-10-22 23:52 UTC (permalink / raw)
  To: Rohit Maheshwari; +Cc: netdev, davem, secdev

On Thu, 22 Oct 2020 15:40:18 +0530 Rohit Maheshwari wrote:
> If TCP congestion caused a very small packets which only has some
> part fo the TAG, and that too is not till the end. HW can't handle
> such case, so falling back to sw crypto in such cases.
> 
> Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>

drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c: At top level:
drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:1864:5: warning: no previous prototype for ‘chcr_ktls_sw_fallback’ [-Wmissing-prototypes]
 1864 | int chcr_ktls_sw_fallback(struct sk_buff *skb, struct chcr_ktls_info *tx_info,
      |     ^~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:1864:5: warning: symbol 'chcr_ktls_sw_fallback' was not declared. Should it be static?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [net 3/7] cxgb4/ch_ktls: creating skbs causes panic
  2020-10-22 10:10 ` [net 3/7] cxgb4/ch_ktls: creating skbs causes panic Rohit Maheshwari
@ 2020-10-22 23:53   ` Jakub Kicinski
  0 siblings, 0 replies; 10+ messages in thread
From: Jakub Kicinski @ 2020-10-22 23:53 UTC (permalink / raw)
  To: Rohit Maheshwari; +Cc: netdev, davem, secdev

On Thu, 22 Oct 2020 15:40:15 +0530 Rohit Maheshwari wrote:
> Creating SKB per tls record and freeing the original one causes
> panic. There will be race if connection reset is requested. By
> freeing original skb, refcnt will be decremented and that means,
> there is no pending record to send, and so tls_dev_del will be
> requested in control path while SKB of related connection is in
> queue.
>  Better approach is to use same SKB to send one record (partial
> data) at a time. We still have to create a new SKB when partial
> last part of a record is requested.
>  This fix introduces new API cxgb4_write_partial_sgl() to send
> partial part of skb. Present cxgb4_write_sgl can only provide
> feasibility to start from an offset which limits to header only
> and it can write sgls for the whole skb len. But this new API
> will help in both. It can start from any offset and can end
> writing in middle of the skb.
> 
> Fixes: 429765a149f1 ("chcr: handle partial end part of a record"
> Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>

Fixes tag: Fixes: 429765a149f1 ("chcr: handle partial end part of a record"
Has these problem(s):
	- Subject has leading but no trailing parentheses

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-10-22 23:53 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-22 10:10 [net 0/7] cxgb4/ch_ktls: Fixes in nic tls code Rohit Maheshwari
2020-10-22 10:10 ` [net 1/7] cxgb4/ch_ktls: decrypted bit is not enough Rohit Maheshwari
2020-10-22 10:10 ` [net 2/7] ch_ktls: Correction in finding correct length Rohit Maheshwari
2020-10-22 10:10 ` [net 3/7] cxgb4/ch_ktls: creating skbs causes panic Rohit Maheshwari
2020-10-22 23:53   ` Jakub Kicinski
2020-10-22 10:10 ` [net 4/7] ch_ktls: Correction in middle record handling Rohit Maheshwari
2020-10-22 10:10 ` [net 5/7] ch_ktls: packet handling prior to start marker Rohit Maheshwari
2020-10-22 10:10 ` [net 6/7] ch_ktls/cxgb4: handle partial tag alone SKBs Rohit Maheshwari
2020-10-22 23:52   ` Jakub Kicinski
2020-10-22 10:10 ` [net 7/7] ch_ktls: tcb update fails sometimes Rohit Maheshwari

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).