All of lore.kernel.org
 help / color / mirror / Atom feed
* [next-queue 00/10] ixgbe: Add ipsec offload
@ 2017-12-05  5:35 ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher
  Cc: steffen.klassert, sowmini.varadhan, netdev

This is an implementation of the ipsec hardware offload feature for
the ixgbe driver and Intel's 10Gbe series NICs: x540, x550, 82599.
These patches apply to net-next v4.14 as well as Jeff Kirsher's next-queue
v4.15-rc1-206-ge47375b.

The ixgbe NICs support ipsec offload for 1024 Rx and 1024 Tx Security
Associations (SAs), using up to 128 inbound IP addresses, and using the
rfc4106(gcm(aes)) encryption.  This code does not yet support IPv6,
checksum offload, or TSO in conjunction with the ipsec offload - those
will be added in the future.

This code shows improvements in both packet throughput and CPU utilization.
For example, here are some quicky numbers that show the magnitude of the
performance gain on a single run of "iperf -c <dest>" with the ipsec
offload on both ends of a point-to-point connection:

	9.4 Gbps - normal case
	7.6 Gbps - ipsec with offload
	343 Mbps - ipsec no offload

To set up a similar test case, you first need to be sure you have a recent
version of iproute2 that supports the ipsec offload tag, probably something
from ip 4.12 or newer would be best.  I have a shell script that builds
up the appropriate commands for me, but here are the resulting commands
for all tcp traffic between 14.0.0.52 and 14.0.0.70:

For the left side (14.0.0.52):
  ip x p add dir out src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp tmpl \
     proto esp src 14.0.0.52 dst 14.0.0.70 spi 0x07 mode transport reqid 0x07
  ip x p add dir in src 14.0.0.70/24 dst 14.0.0.52/24 proto tcp tmpl \
     proto esp dst 14.0.0.52 src 14.0.0.70 spi 0x07 mode transport reqid 0x07
  ip x s add proto esp src 14.0.0.52 dst 14.0.0.70 spi 0x07 mode transport \
     reqid 0x07 replay-window 32 \
     aead 'rfc4106(gcm(aes))' 0x44434241343332312423222114131211f4f3f2f1 128 \
     sel src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp offload dev eth4 dir out
  ip x s add proto esp dst 14.0.0.52 src 14.0.0.70 spi 0x07 mode transport \
     reqid 0x07 replay-window 32 \
     aead 'rfc4106(gcm(aes))' 0x44434241343332312423222114131211f4f3f2f1 128 \
     sel src 14.0.0.70/24 dst 14.0.0.52/24 proto tcp offload dev eth4 dir in
 
For the right side (14.0.0.70):
  ip x p add dir out src 14.0.0.70/24 dst 14.0.0.52/24 proto tcp tmpl \
     proto esp src 14.0.0.70 dst 14.0.0.52 spi 0x07 mode transport reqid 0x07
  ip x p add dir in src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp tmpl \
     proto esp dst 14.0.0.70 src 14.0.0.52 spi 0x07 mode transport reqid 0x07
  ip x s add proto esp src 14.0.0.70 dst 14.0.0.52 spi 0x07 mode transport \
     reqid 0x07 replay-window 32 \
     aead 'rfc4106(gcm(aes))' 0x44434241343332312423222114131211f4f3f2f1 128 \
     sel src 14.0.0.70/24 dst 14.0.0.52/24 proto tcp offload dev eth4 dir out
  ip x s add proto esp dst 14.0.0.70 src 14.0.0.52 spi 0x07 mode transport \
     reqid 0x07 replay-window 32 \
     aead 'rfc4106(gcm(aes))' 0x44434241343332312423222114131211f4f3f2f1 128 \
     sel src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp offload dev eth4 dir in

In both cases, the command "ip x s flush ; ip x p flush" will clean
it all out and remove the offloads.

Lastly, thanks to Alex Duyck for his early comments.

Shannon Nelson (10):
  ixgbe: clean up ipsec defines
  ixgbe: add ipsec register access routines
  ixgbe: add ipsec engine start and stop routines
  ixgbe: add ipsec data structures
  ixgbe: implement ipsec add and remove of offloaded SA
  ixgbe: restore offloaded SAs after a reset
  ixgbe: process the Rx ipsec offload
  ixgbe: process the Tx ipsec offload
  ixgbe: ipsec offload stats
  ixgbe: register ipsec offload with the xfrm subsystem

 drivers/net/ethernet/intel/ixgbe/Makefile        |   1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe.h         |  30 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |  28 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c   | 900 +++++++++++++++++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h   |  90 +++
 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c     |   4 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    |  53 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_type.h    |  22 +-
 8 files changed, 1090 insertions(+), 38 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
 create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h

-- 
2.7.4

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 00/10] ixgbe: Add ipsec offload
@ 2017-12-05  5:35 ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan

This is an implementation of the ipsec hardware offload feature for
the ixgbe driver and Intel's 10Gbe series NICs: x540, x550, 82599.
These patches apply to net-next v4.14 as well as Jeff Kirsher's next-queue
v4.15-rc1-206-ge47375b.

The ixgbe NICs support ipsec offload for 1024 Rx and 1024 Tx Security
Associations (SAs), using up to 128 inbound IP addresses, and using the
rfc4106(gcm(aes)) encryption.  This code does not yet support IPv6,
checksum offload, or TSO in conjunction with the ipsec offload - those
will be added in the future.

This code shows improvements in both packet throughput and CPU utilization.
For example, here are some quicky numbers that show the magnitude of the
performance gain on a single run of "iperf -c <dest>" with the ipsec
offload on both ends of a point-to-point connection:

	9.4 Gbps - normal case
	7.6 Gbps - ipsec with offload
	343 Mbps - ipsec no offload

To set up a similar test case, you first need to be sure you have a recent
version of iproute2 that supports the ipsec offload tag, probably something
from ip 4.12 or newer would be best.  I have a shell script that builds
up the appropriate commands for me, but here are the resulting commands
for all tcp traffic between 14.0.0.52 and 14.0.0.70:

For the left side (14.0.0.52):
  ip x p add dir out src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp tmpl \
     proto esp src 14.0.0.52 dst 14.0.0.70 spi 0x07 mode transport reqid 0x07
  ip x p add dir in src 14.0.0.70/24 dst 14.0.0.52/24 proto tcp tmpl \
     proto esp dst 14.0.0.52 src 14.0.0.70 spi 0x07 mode transport reqid 0x07
  ip x s add proto esp src 14.0.0.52 dst 14.0.0.70 spi 0x07 mode transport \
     reqid 0x07 replay-window 32 \
     aead 'rfc4106(gcm(aes))' 0x44434241343332312423222114131211f4f3f2f1 128 \
     sel src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp offload dev eth4 dir out
  ip x s add proto esp dst 14.0.0.52 src 14.0.0.70 spi 0x07 mode transport \
     reqid 0x07 replay-window 32 \
     aead 'rfc4106(gcm(aes))' 0x44434241343332312423222114131211f4f3f2f1 128 \
     sel src 14.0.0.70/24 dst 14.0.0.52/24 proto tcp offload dev eth4 dir in
 
For the right side (14.0.0.70):
  ip x p add dir out src 14.0.0.70/24 dst 14.0.0.52/24 proto tcp tmpl \
     proto esp src 14.0.0.70 dst 14.0.0.52 spi 0x07 mode transport reqid 0x07
  ip x p add dir in src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp tmpl \
     proto esp dst 14.0.0.70 src 14.0.0.52 spi 0x07 mode transport reqid 0x07
  ip x s add proto esp src 14.0.0.70 dst 14.0.0.52 spi 0x07 mode transport \
     reqid 0x07 replay-window 32 \
     aead 'rfc4106(gcm(aes))' 0x44434241343332312423222114131211f4f3f2f1 128 \
     sel src 14.0.0.70/24 dst 14.0.0.52/24 proto tcp offload dev eth4 dir out
  ip x s add proto esp dst 14.0.0.70 src 14.0.0.52 spi 0x07 mode transport \
     reqid 0x07 replay-window 32 \
     aead 'rfc4106(gcm(aes))' 0x44434241343332312423222114131211f4f3f2f1 128 \
     sel src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp offload dev eth4 dir in

In both cases, the command "ip x s flush ; ip x p flush" will clean
it all out and remove the offloads.

Lastly, thanks to Alex Duyck for his early comments.

Shannon Nelson (10):
  ixgbe: clean up ipsec defines
  ixgbe: add ipsec register access routines
  ixgbe: add ipsec engine start and stop routines
  ixgbe: add ipsec data structures
  ixgbe: implement ipsec add and remove of offloaded SA
  ixgbe: restore offloaded SAs after a reset
  ixgbe: process the Rx ipsec offload
  ixgbe: process the Tx ipsec offload
  ixgbe: ipsec offload stats
  ixgbe: register ipsec offload with the xfrm subsystem

 drivers/net/ethernet/intel/ixgbe/Makefile        |   1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe.h         |  30 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |  28 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c   | 900 +++++++++++++++++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h   |  90 +++
 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c     |   4 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    |  53 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_type.h    |  22 +-
 8 files changed, 1090 insertions(+), 38 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
 create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h

-- 
2.7.4


^ permalink raw reply	[flat|nested] 78+ messages in thread

* [next-queue 01/10] ixgbe: clean up ipsec defines
  2017-12-05  5:35 ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05  5:35   ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher
  Cc: steffen.klassert, sowmini.varadhan, netdev

Clean up the ipsec/macsec descriptor bit definitions to match the rest
of the defines and file organization.  Also recognise the bit-definition
overlap in the error mask macro.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_type.h | 20 +++++++-------------
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
index ffa0ee5..3df0763 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
@@ -2321,11 +2321,6 @@ enum {
 #define IXGBE_TXD_CMD_VLE    0x40000000 /* Add VLAN tag */
 #define IXGBE_TXD_STAT_DD    0x00000001 /* Descriptor Done */
 
-#define IXGBE_RXDADV_IPSEC_STATUS_SECP                  0x00020000
-#define IXGBE_RXDADV_IPSEC_ERROR_INVALID_PROTOCOL       0x08000000
-#define IXGBE_RXDADV_IPSEC_ERROR_INVALID_LENGTH         0x10000000
-#define IXGBE_RXDADV_IPSEC_ERROR_AUTH_FAILED            0x18000000
-#define IXGBE_RXDADV_IPSEC_ERROR_BIT_MASK               0x18000000
 /* Multiple Transmit Queue Command Register */
 #define IXGBE_MTQC_RT_ENA       0x1 /* DCB Enable */
 #define IXGBE_MTQC_VT_ENA       0x2 /* VMDQ2 Enable */
@@ -2377,6 +2372,9 @@ enum {
 #define IXGBE_RXDADV_ERR_LE     0x02000000 /* Length Error */
 #define IXGBE_RXDADV_ERR_PE     0x08000000 /* Packet Error */
 #define IXGBE_RXDADV_ERR_OSE    0x10000000 /* Oversize Error */
+#define IXGBE_RXDADV_ERR_IPSEC_INV_PROTOCOL  0x08000000 /* overlap ERR_PE  */
+#define IXGBE_RXDADV_ERR_IPSEC_INV_LENGTH    0x10000000 /* overlap ERR_OSE */
+#define IXGBE_RXDADV_ERR_IPSEC_AUTH_FAILED   0x18000000
 #define IXGBE_RXDADV_ERR_USE    0x20000000 /* Undersize Error */
 #define IXGBE_RXDADV_ERR_TCPE   0x40000000 /* TCP/UDP Checksum Error */
 #define IXGBE_RXDADV_ERR_IPE    0x80000000 /* IP Checksum Error */
@@ -2398,6 +2396,7 @@ enum {
 #define IXGBE_RXDADV_STAT_FCSTAT_FCPRSP 0x00000020 /* 10: Recv. FCP_RSP */
 #define IXGBE_RXDADV_STAT_FCSTAT_DDP    0x00000030 /* 11: Ctxt w/ DDP */
 #define IXGBE_RXDADV_STAT_TS		0x00010000 /* IEEE 1588 Time Stamp */
+#define IXGBE_RXDADV_STAT_SECP          0x00020000 /* IPsec/MACsec pkt found */
 
 /* PSRTYPE bit definitions */
 #define IXGBE_PSRTYPE_TCPHDR    0x00000010
@@ -2464,13 +2463,6 @@ enum {
 #define IXGBE_RXDADV_PKTTYPE_ETQF_MASK  0x00000070 /* ETQF has 8 indices */
 #define IXGBE_RXDADV_PKTTYPE_ETQF_SHIFT 4          /* Right-shift 4 bits */
 
-/* Security Processing bit Indication */
-#define IXGBE_RXDADV_LNKSEC_STATUS_SECP         0x00020000
-#define IXGBE_RXDADV_LNKSEC_ERROR_NO_SA_MATCH   0x08000000
-#define IXGBE_RXDADV_LNKSEC_ERROR_REPLAY_ERROR  0x10000000
-#define IXGBE_RXDADV_LNKSEC_ERROR_BIT_MASK      0x18000000
-#define IXGBE_RXDADV_LNKSEC_ERROR_BAD_SIG       0x18000000
-
 /* Masks to determine if packets should be dropped due to frame errors */
 #define IXGBE_RXD_ERR_FRAME_ERR_MASK ( \
 				      IXGBE_RXD_ERR_CE | \
@@ -2484,6 +2476,8 @@ enum {
 				      IXGBE_RXDADV_ERR_LE | \
 				      IXGBE_RXDADV_ERR_PE | \
 				      IXGBE_RXDADV_ERR_OSE | \
+				      IXGBE_RXDADV_ERR_IPSEC_INV_PROTOCOL | \
+				      IXGBE_RXDADV_ERR_IPSEC_INV_LENGTH | \
 				      IXGBE_RXDADV_ERR_USE)
 
 /* Multicast bit mask */
@@ -2893,6 +2887,7 @@ struct ixgbe_adv_tx_context_desc {
 				 IXGBE_ADVTXD_POPTS_SHIFT)
 #define IXGBE_ADVTXD_POPTS_TXSM (IXGBE_TXD_POPTS_TXSM << \
 				 IXGBE_ADVTXD_POPTS_SHIFT)
+#define IXGBE_ADVTXD_POPTS_IPSEC     0x00000400 /* IPSec offload request */
 #define IXGBE_ADVTXD_POPTS_ISCO_1ST  0x00000000 /* 1st TSO of iSCSI PDU */
 #define IXGBE_ADVTXD_POPTS_ISCO_MDL  0x00000800 /* Middle TSO of iSCSI PDU */
 #define IXGBE_ADVTXD_POPTS_ISCO_LAST 0x00001000 /* Last TSO of iSCSI PDU */
@@ -2908,7 +2903,6 @@ struct ixgbe_adv_tx_context_desc {
 #define IXGBE_ADVTXD_TUCMD_L4T_SCTP  0x00001000  /* L4 Packet TYPE of SCTP */
 #define IXGBE_ADVTXD_TUCMD_L4T_RSV     0x00001800 /* RSV L4 Packet TYPE */
 #define IXGBE_ADVTXD_TUCMD_MKRREQ    0x00002000 /*Req requires Markers and CRC*/
-#define IXGBE_ADVTXD_POPTS_IPSEC      0x00000400 /* IPSec offload request */
 #define IXGBE_ADVTXD_TUCMD_IPSEC_TYPE_ESP 0x00002000 /* IPSec Type ESP */
 #define IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN 0x00004000/* ESP Encrypt Enable */
 #define IXGBE_ADVTXT_TUCMD_FCOE      0x00008000       /* FCoE Frame Type */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 01/10] ixgbe: clean up ipsec defines
@ 2017-12-05  5:35   ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan

Clean up the ipsec/macsec descriptor bit definitions to match the rest
of the defines and file organization.  Also recognise the bit-definition
overlap in the error mask macro.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_type.h | 20 +++++++-------------
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
index ffa0ee5..3df0763 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
@@ -2321,11 +2321,6 @@ enum {
 #define IXGBE_TXD_CMD_VLE    0x40000000 /* Add VLAN tag */
 #define IXGBE_TXD_STAT_DD    0x00000001 /* Descriptor Done */
 
-#define IXGBE_RXDADV_IPSEC_STATUS_SECP                  0x00020000
-#define IXGBE_RXDADV_IPSEC_ERROR_INVALID_PROTOCOL       0x08000000
-#define IXGBE_RXDADV_IPSEC_ERROR_INVALID_LENGTH         0x10000000
-#define IXGBE_RXDADV_IPSEC_ERROR_AUTH_FAILED            0x18000000
-#define IXGBE_RXDADV_IPSEC_ERROR_BIT_MASK               0x18000000
 /* Multiple Transmit Queue Command Register */
 #define IXGBE_MTQC_RT_ENA       0x1 /* DCB Enable */
 #define IXGBE_MTQC_VT_ENA       0x2 /* VMDQ2 Enable */
@@ -2377,6 +2372,9 @@ enum {
 #define IXGBE_RXDADV_ERR_LE     0x02000000 /* Length Error */
 #define IXGBE_RXDADV_ERR_PE     0x08000000 /* Packet Error */
 #define IXGBE_RXDADV_ERR_OSE    0x10000000 /* Oversize Error */
+#define IXGBE_RXDADV_ERR_IPSEC_INV_PROTOCOL  0x08000000 /* overlap ERR_PE  */
+#define IXGBE_RXDADV_ERR_IPSEC_INV_LENGTH    0x10000000 /* overlap ERR_OSE */
+#define IXGBE_RXDADV_ERR_IPSEC_AUTH_FAILED   0x18000000
 #define IXGBE_RXDADV_ERR_USE    0x20000000 /* Undersize Error */
 #define IXGBE_RXDADV_ERR_TCPE   0x40000000 /* TCP/UDP Checksum Error */
 #define IXGBE_RXDADV_ERR_IPE    0x80000000 /* IP Checksum Error */
@@ -2398,6 +2396,7 @@ enum {
 #define IXGBE_RXDADV_STAT_FCSTAT_FCPRSP 0x00000020 /* 10: Recv. FCP_RSP */
 #define IXGBE_RXDADV_STAT_FCSTAT_DDP    0x00000030 /* 11: Ctxt w/ DDP */
 #define IXGBE_RXDADV_STAT_TS		0x00010000 /* IEEE 1588 Time Stamp */
+#define IXGBE_RXDADV_STAT_SECP          0x00020000 /* IPsec/MACsec pkt found */
 
 /* PSRTYPE bit definitions */
 #define IXGBE_PSRTYPE_TCPHDR    0x00000010
@@ -2464,13 +2463,6 @@ enum {
 #define IXGBE_RXDADV_PKTTYPE_ETQF_MASK  0x00000070 /* ETQF has 8 indices */
 #define IXGBE_RXDADV_PKTTYPE_ETQF_SHIFT 4          /* Right-shift 4 bits */
 
-/* Security Processing bit Indication */
-#define IXGBE_RXDADV_LNKSEC_STATUS_SECP         0x00020000
-#define IXGBE_RXDADV_LNKSEC_ERROR_NO_SA_MATCH   0x08000000
-#define IXGBE_RXDADV_LNKSEC_ERROR_REPLAY_ERROR  0x10000000
-#define IXGBE_RXDADV_LNKSEC_ERROR_BIT_MASK      0x18000000
-#define IXGBE_RXDADV_LNKSEC_ERROR_BAD_SIG       0x18000000
-
 /* Masks to determine if packets should be dropped due to frame errors */
 #define IXGBE_RXD_ERR_FRAME_ERR_MASK ( \
 				      IXGBE_RXD_ERR_CE | \
@@ -2484,6 +2476,8 @@ enum {
 				      IXGBE_RXDADV_ERR_LE | \
 				      IXGBE_RXDADV_ERR_PE | \
 				      IXGBE_RXDADV_ERR_OSE | \
+				      IXGBE_RXDADV_ERR_IPSEC_INV_PROTOCOL | \
+				      IXGBE_RXDADV_ERR_IPSEC_INV_LENGTH | \
 				      IXGBE_RXDADV_ERR_USE)
 
 /* Multicast bit mask */
@@ -2893,6 +2887,7 @@ struct ixgbe_adv_tx_context_desc {
 				 IXGBE_ADVTXD_POPTS_SHIFT)
 #define IXGBE_ADVTXD_POPTS_TXSM (IXGBE_TXD_POPTS_TXSM << \
 				 IXGBE_ADVTXD_POPTS_SHIFT)
+#define IXGBE_ADVTXD_POPTS_IPSEC     0x00000400 /* IPSec offload request */
 #define IXGBE_ADVTXD_POPTS_ISCO_1ST  0x00000000 /* 1st TSO of iSCSI PDU */
 #define IXGBE_ADVTXD_POPTS_ISCO_MDL  0x00000800 /* Middle TSO of iSCSI PDU */
 #define IXGBE_ADVTXD_POPTS_ISCO_LAST 0x00001000 /* Last TSO of iSCSI PDU */
@@ -2908,7 +2903,6 @@ struct ixgbe_adv_tx_context_desc {
 #define IXGBE_ADVTXD_TUCMD_L4T_SCTP  0x00001000  /* L4 Packet TYPE of SCTP */
 #define IXGBE_ADVTXD_TUCMD_L4T_RSV     0x00001800 /* RSV L4 Packet TYPE */
 #define IXGBE_ADVTXD_TUCMD_MKRREQ    0x00002000 /*Req requires Markers and CRC*/
-#define IXGBE_ADVTXD_POPTS_IPSEC      0x00000400 /* IPSec offload request */
 #define IXGBE_ADVTXD_TUCMD_IPSEC_TYPE_ESP 0x00002000 /* IPSec Type ESP */
 #define IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN 0x00004000/* ESP Encrypt Enable */
 #define IXGBE_ADVTXT_TUCMD_FCOE      0x00008000       /* FCoE Frame Type */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [next-queue 02/10] ixgbe: add ipsec register access routines
  2017-12-05  5:35 ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05  5:35   ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher
  Cc: steffen.klassert, sowmini.varadhan, netdev

Add a few routines to make access to the ipsec registers just a little
easier, and throw in the beginnings of an initialization.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/Makefile      |   1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe.h       |   6 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 157 +++++++++++++++++++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h |  50 ++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |   1 +
 5 files changed, 215 insertions(+)
 create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
 create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h

diff --git a/drivers/net/ethernet/intel/ixgbe/Makefile b/drivers/net/ethernet/intel/ixgbe/Makefile
index 35e6fa6..8319465 100644
--- a/drivers/net/ethernet/intel/ixgbe/Makefile
+++ b/drivers/net/ethernet/intel/ixgbe/Makefile
@@ -42,3 +42,4 @@ ixgbe-$(CONFIG_IXGBE_DCB) +=  ixgbe_dcb.o ixgbe_dcb_82598.o \
 ixgbe-$(CONFIG_IXGBE_HWMON) += ixgbe_sysfs.o
 ixgbe-$(CONFIG_DEBUG_FS) += ixgbe_debugfs.o
 ixgbe-$(CONFIG_FCOE:m=y) += ixgbe_fcoe.o
+ixgbe-$(CONFIG_XFRM_OFFLOAD) += ixgbe_ipsec.o
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index dd55787..1e11462 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -52,6 +52,7 @@
 #ifdef CONFIG_IXGBE_DCA
 #include <linux/dca.h>
 #endif
+#include "ixgbe_ipsec.h"
 
 #include <net/busy_poll.h>
 
@@ -1001,4 +1002,9 @@ void ixgbe_store_key(struct ixgbe_adapter *adapter);
 void ixgbe_store_reta(struct ixgbe_adapter *adapter);
 s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
 		       u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
+#ifdef CONFIG_XFRM_OFFLOAD
+void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
+#else
+static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
+#endif /* CONFIG_XFRM_OFFLOAD */
 #endif /* _IXGBE_H_ */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
new file mode 100644
index 0000000..14dd011
--- /dev/null
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -0,0 +1,157 @@
+/*******************************************************************************
+ *
+ * Intel 10 Gigabit PCI Express Linux driver
+ * Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * Linux NICS <linux.nics@intel.com>
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ *
+ ******************************************************************************/
+
+#include "ixgbe.h"
+
+/**
+ * ixgbe_ipsec_set_tx_sa - set the Tx SA registers
+ * @hw: hw specific details
+ * @idx: register index to write
+ * @key: key byte array
+ * @salt: salt bytes
+ **/
+static void ixgbe_ipsec_set_tx_sa(struct ixgbe_hw *hw, u16 idx,
+				  u32 key[], u32 salt)
+{
+	u32 reg;
+	int i;
+
+	for (i = 0; i < 4; i++)
+		IXGBE_WRITE_REG(hw, IXGBE_IPSTXKEY(i), cpu_to_be32(key[3-i]));
+	IXGBE_WRITE_REG(hw, IXGBE_IPSTXSALT, cpu_to_be32(salt));
+	IXGBE_WRITE_FLUSH(hw);
+
+	reg = IXGBE_READ_REG(hw, IXGBE_IPSTXIDX);
+	reg &= IXGBE_RXTXIDX_IPS_EN;
+	reg |= idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
+	IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, reg);
+	IXGBE_WRITE_FLUSH(hw);
+}
+
+/**
+ * ixgbe_ipsec_set_rx_item - set an Rx table item
+ * @hw: hw specific details
+ * @idx: register index to write
+ * @tbl: table selector
+ *
+ * Trigger the device to store into a particular Rx table the
+ * data that has already been loaded into the input register
+ **/
+static void ixgbe_ipsec_set_rx_item(struct ixgbe_hw *hw, u16 idx, u32 tbl)
+{
+	u32 reg;
+
+	reg = IXGBE_READ_REG(hw, IXGBE_IPSRXIDX);
+	reg &= IXGBE_RXTXIDX_IPS_EN;
+	reg |= tbl | idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
+	IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, reg);
+	IXGBE_WRITE_FLUSH(hw);
+}
+
+/**
+ * ixgbe_ipsec_set_rx_sa - set up the register bits to save SA info
+ * @hw: hw specific details
+ * @idx: register index to write
+ * @spi: security parameter index
+ * @key: key byte array
+ * @salt: salt bytes
+ * @mode: rx decrypt control bits
+ * @ip_idx: index into IP table for related IP address
+ **/
+static void ixgbe_ipsec_set_rx_sa(struct ixgbe_hw *hw, u16 idx, __be32 spi,
+				  u32 key[], u32 salt, u32 mode, u32 ip_idx)
+{
+	int i;
+
+	/* store the SPI (in bigendian) and IPidx */
+	IXGBE_WRITE_REG(hw, IXGBE_IPSRXSPI, spi);
+	IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPIDX, ip_idx);
+	IXGBE_WRITE_FLUSH(hw);
+
+	ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_SPI);
+
+	/* store the key, salt, and mode */
+	for (i = 0; i < 4; i++)
+		IXGBE_WRITE_REG(hw, IXGBE_IPSRXKEY(i), cpu_to_be32(key[3-i]));
+	IXGBE_WRITE_REG(hw, IXGBE_IPSRXSALT, cpu_to_be32(salt));
+	IXGBE_WRITE_REG(hw, IXGBE_IPSRXMOD, mode);
+	IXGBE_WRITE_FLUSH(hw);
+
+	ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_KEY);
+}
+
+/**
+ * ixgbe_ipsec_set_rx_ip - set up the register bits to save SA IP addr info
+ * @hw: hw specific details
+ * @idx: register index to write
+ * @addr: IP address byte array
+ **/
+static void ixgbe_ipsec_set_rx_ip(struct ixgbe_hw *hw, u16 idx, u32 addr[])
+{
+	int i;
+
+	/* store the ip address */
+	for (i = 0; i < 4; i++)
+		IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPADDR(i), addr[i]);
+	IXGBE_WRITE_FLUSH(hw);
+
+	ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_IP);
+}
+
+/**
+ * ixgbe_ipsec_clear_hw_tables - because some tables don't get cleared on reset
+ * @adapter: board private structure
+ **/
+void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	u32 buf[4] = {0, 0, 0, 0};
+	u16 idx;
+
+	/* disable Rx and Tx SA lookup */
+	IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, 0);
+	IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, 0);
+
+	/* scrub the tables */
+	for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
+		ixgbe_ipsec_set_tx_sa(hw, idx, buf, 0);
+
+	for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
+		ixgbe_ipsec_set_rx_sa(hw, idx, 0, buf, 0, 0, 0);
+
+	for (idx = 0; idx < IXGBE_IPSEC_MAX_RX_IP_COUNT; idx++)
+		ixgbe_ipsec_set_rx_ip(hw, idx, buf);
+}
+
+/**
+ * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
+ * @adapter: board private structure
+ **/
+void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
+{
+	ixgbe_ipsec_clear_hw_tables(adapter);
+}
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
new file mode 100644
index 0000000..017b13f
--- /dev/null
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
@@ -0,0 +1,50 @@
+/*******************************************************************************
+
+  Intel 10 Gigabit PCI Express Linux driver
+  Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program.  If not, see <http://www.gnu.org/licenses/>.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Contact Information:
+  Linux NICS <linux.nics@intel.com>
+  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
+*******************************************************************************/
+
+#ifndef _IXGBE_IPSEC_H_
+#define _IXGBE_IPSEC_H_
+
+#define IXGBE_IPSEC_MAX_SA_COUNT	1024
+#define IXGBE_IPSEC_MAX_RX_IP_COUNT	128
+#define IXGBE_IPSEC_BASE_RX_INDEX	IXGBE_IPSEC_MAX_SA_COUNT
+#define IXGBE_IPSEC_BASE_TX_INDEX	(IXGBE_IPSEC_MAX_SA_COUNT * 2)
+
+#define IXGBE_RXTXIDX_IPS_EN		0x00000001
+#define IXGBE_RXIDX_TBL_MASK		0x00000006
+#define IXGBE_RXIDX_TBL_IP		0x00000002
+#define IXGBE_RXIDX_TBL_SPI		0x00000004
+#define IXGBE_RXIDX_TBL_KEY		0x00000006
+#define IXGBE_RXTXIDX_IDX_MASK		0x00001ff8
+#define IXGBE_RXTXIDX_IDX_READ		0x40000000
+#define IXGBE_RXTXIDX_IDX_WRITE		0x80000000
+
+#define IXGBE_RXMOD_VALID		0x00000001
+#define IXGBE_RXMOD_PROTO_ESP		0x00000004
+#define IXGBE_RXMOD_DECRYPT		0x00000008
+#define IXGBE_RXMOD_IPV6		0x00000010
+
+#endif /* _IXGBE_IPSEC_H_ */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 6d5f31e..51fb3cf 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -10327,6 +10327,7 @@ static int ixgbe_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 					 NETIF_F_FCOE_MTU;
 	}
 #endif /* IXGBE_FCOE */
+	ixgbe_init_ipsec_offload(adapter);
 
 	if (adapter->flags2 & IXGBE_FLAG2_RSC_CAPABLE)
 		netdev->hw_features |= NETIF_F_LRO;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 02/10] ixgbe: add ipsec register access routines
@ 2017-12-05  5:35   ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan

Add a few routines to make access to the ipsec registers just a little
easier, and throw in the beginnings of an initialization.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/Makefile      |   1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe.h       |   6 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 157 +++++++++++++++++++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h |  50 ++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |   1 +
 5 files changed, 215 insertions(+)
 create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
 create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h

diff --git a/drivers/net/ethernet/intel/ixgbe/Makefile b/drivers/net/ethernet/intel/ixgbe/Makefile
index 35e6fa6..8319465 100644
--- a/drivers/net/ethernet/intel/ixgbe/Makefile
+++ b/drivers/net/ethernet/intel/ixgbe/Makefile
@@ -42,3 +42,4 @@ ixgbe-$(CONFIG_IXGBE_DCB) +=  ixgbe_dcb.o ixgbe_dcb_82598.o \
 ixgbe-$(CONFIG_IXGBE_HWMON) += ixgbe_sysfs.o
 ixgbe-$(CONFIG_DEBUG_FS) += ixgbe_debugfs.o
 ixgbe-$(CONFIG_FCOE:m=y) += ixgbe_fcoe.o
+ixgbe-$(CONFIG_XFRM_OFFLOAD) += ixgbe_ipsec.o
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index dd55787..1e11462 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -52,6 +52,7 @@
 #ifdef CONFIG_IXGBE_DCA
 #include <linux/dca.h>
 #endif
+#include "ixgbe_ipsec.h"
 
 #include <net/busy_poll.h>
 
@@ -1001,4 +1002,9 @@ void ixgbe_store_key(struct ixgbe_adapter *adapter);
 void ixgbe_store_reta(struct ixgbe_adapter *adapter);
 s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
 		       u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
+#ifdef CONFIG_XFRM_OFFLOAD
+void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
+#else
+static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
+#endif /* CONFIG_XFRM_OFFLOAD */
 #endif /* _IXGBE_H_ */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
new file mode 100644
index 0000000..14dd011
--- /dev/null
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -0,0 +1,157 @@
+/*******************************************************************************
+ *
+ * Intel 10 Gigabit PCI Express Linux driver
+ * Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Contact Information:
+ * Linux NICS <linux.nics@intel.com>
+ * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+ * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+ *
+ ******************************************************************************/
+
+#include "ixgbe.h"
+
+/**
+ * ixgbe_ipsec_set_tx_sa - set the Tx SA registers
+ * @hw: hw specific details
+ * @idx: register index to write
+ * @key: key byte array
+ * @salt: salt bytes
+ **/
+static void ixgbe_ipsec_set_tx_sa(struct ixgbe_hw *hw, u16 idx,
+				  u32 key[], u32 salt)
+{
+	u32 reg;
+	int i;
+
+	for (i = 0; i < 4; i++)
+		IXGBE_WRITE_REG(hw, IXGBE_IPSTXKEY(i), cpu_to_be32(key[3-i]));
+	IXGBE_WRITE_REG(hw, IXGBE_IPSTXSALT, cpu_to_be32(salt));
+	IXGBE_WRITE_FLUSH(hw);
+
+	reg = IXGBE_READ_REG(hw, IXGBE_IPSTXIDX);
+	reg &= IXGBE_RXTXIDX_IPS_EN;
+	reg |= idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
+	IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, reg);
+	IXGBE_WRITE_FLUSH(hw);
+}
+
+/**
+ * ixgbe_ipsec_set_rx_item - set an Rx table item
+ * @hw: hw specific details
+ * @idx: register index to write
+ * @tbl: table selector
+ *
+ * Trigger the device to store into a particular Rx table the
+ * data that has already been loaded into the input register
+ **/
+static void ixgbe_ipsec_set_rx_item(struct ixgbe_hw *hw, u16 idx, u32 tbl)
+{
+	u32 reg;
+
+	reg = IXGBE_READ_REG(hw, IXGBE_IPSRXIDX);
+	reg &= IXGBE_RXTXIDX_IPS_EN;
+	reg |= tbl | idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
+	IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, reg);
+	IXGBE_WRITE_FLUSH(hw);
+}
+
+/**
+ * ixgbe_ipsec_set_rx_sa - set up the register bits to save SA info
+ * @hw: hw specific details
+ * @idx: register index to write
+ * @spi: security parameter index
+ * @key: key byte array
+ * @salt: salt bytes
+ * @mode: rx decrypt control bits
+ * @ip_idx: index into IP table for related IP address
+ **/
+static void ixgbe_ipsec_set_rx_sa(struct ixgbe_hw *hw, u16 idx, __be32 spi,
+				  u32 key[], u32 salt, u32 mode, u32 ip_idx)
+{
+	int i;
+
+	/* store the SPI (in bigendian) and IPidx */
+	IXGBE_WRITE_REG(hw, IXGBE_IPSRXSPI, spi);
+	IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPIDX, ip_idx);
+	IXGBE_WRITE_FLUSH(hw);
+
+	ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_SPI);
+
+	/* store the key, salt, and mode */
+	for (i = 0; i < 4; i++)
+		IXGBE_WRITE_REG(hw, IXGBE_IPSRXKEY(i), cpu_to_be32(key[3-i]));
+	IXGBE_WRITE_REG(hw, IXGBE_IPSRXSALT, cpu_to_be32(salt));
+	IXGBE_WRITE_REG(hw, IXGBE_IPSRXMOD, mode);
+	IXGBE_WRITE_FLUSH(hw);
+
+	ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_KEY);
+}
+
+/**
+ * ixgbe_ipsec_set_rx_ip - set up the register bits to save SA IP addr info
+ * @hw: hw specific details
+ * @idx: register index to write
+ * @addr: IP address byte array
+ **/
+static void ixgbe_ipsec_set_rx_ip(struct ixgbe_hw *hw, u16 idx, u32 addr[])
+{
+	int i;
+
+	/* store the ip address */
+	for (i = 0; i < 4; i++)
+		IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPADDR(i), addr[i]);
+	IXGBE_WRITE_FLUSH(hw);
+
+	ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_IP);
+}
+
+/**
+ * ixgbe_ipsec_clear_hw_tables - because some tables don't get cleared on reset
+ * @adapter: board private structure
+ **/
+void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	u32 buf[4] = {0, 0, 0, 0};
+	u16 idx;
+
+	/* disable Rx and Tx SA lookup */
+	IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, 0);
+	IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, 0);
+
+	/* scrub the tables */
+	for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
+		ixgbe_ipsec_set_tx_sa(hw, idx, buf, 0);
+
+	for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
+		ixgbe_ipsec_set_rx_sa(hw, idx, 0, buf, 0, 0, 0);
+
+	for (idx = 0; idx < IXGBE_IPSEC_MAX_RX_IP_COUNT; idx++)
+		ixgbe_ipsec_set_rx_ip(hw, idx, buf);
+}
+
+/**
+ * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
+ * @adapter: board private structure
+ **/
+void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
+{
+	ixgbe_ipsec_clear_hw_tables(adapter);
+}
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
new file mode 100644
index 0000000..017b13f
--- /dev/null
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
@@ -0,0 +1,50 @@
+/*******************************************************************************
+
+  Intel 10 Gigabit PCI Express Linux driver
+  Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program.  If not, see <http://www.gnu.org/licenses/>.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Contact Information:
+  Linux NICS <linux.nics@intel.com>
+  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
+*******************************************************************************/
+
+#ifndef _IXGBE_IPSEC_H_
+#define _IXGBE_IPSEC_H_
+
+#define IXGBE_IPSEC_MAX_SA_COUNT	1024
+#define IXGBE_IPSEC_MAX_RX_IP_COUNT	128
+#define IXGBE_IPSEC_BASE_RX_INDEX	IXGBE_IPSEC_MAX_SA_COUNT
+#define IXGBE_IPSEC_BASE_TX_INDEX	(IXGBE_IPSEC_MAX_SA_COUNT * 2)
+
+#define IXGBE_RXTXIDX_IPS_EN		0x00000001
+#define IXGBE_RXIDX_TBL_MASK		0x00000006
+#define IXGBE_RXIDX_TBL_IP		0x00000002
+#define IXGBE_RXIDX_TBL_SPI		0x00000004
+#define IXGBE_RXIDX_TBL_KEY		0x00000006
+#define IXGBE_RXTXIDX_IDX_MASK		0x00001ff8
+#define IXGBE_RXTXIDX_IDX_READ		0x40000000
+#define IXGBE_RXTXIDX_IDX_WRITE		0x80000000
+
+#define IXGBE_RXMOD_VALID		0x00000001
+#define IXGBE_RXMOD_PROTO_ESP		0x00000004
+#define IXGBE_RXMOD_DECRYPT		0x00000008
+#define IXGBE_RXMOD_IPV6		0x00000010
+
+#endif /* _IXGBE_IPSEC_H_ */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 6d5f31e..51fb3cf 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -10327,6 +10327,7 @@ static int ixgbe_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 					 NETIF_F_FCOE_MTU;
 	}
 #endif /* IXGBE_FCOE */
+	ixgbe_init_ipsec_offload(adapter);
 
 	if (adapter->flags2 & IXGBE_FLAG2_RSC_CAPABLE)
 		netdev->hw_features |= NETIF_F_LRO;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [next-queue 03/10] ixgbe: add ipsec engine start and stop routines
  2017-12-05  5:35 ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05  5:35   ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher
  Cc: steffen.klassert, sowmini.varadhan, netdev

Add in the code for running and stopping the hardware ipsec
encryption/decryption engine.  It is good to keep the engine
off when not in use in order to save on the power draw.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 140 +++++++++++++++++++++++++
 1 file changed, 140 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 14dd011..38a1a16 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -148,10 +148,150 @@ void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
 }
 
 /**
+ * ixgbe_ipsec_stop_data
+ * @adapter: board private structure
+ **/
+static void ixgbe_ipsec_stop_data(struct ixgbe_adapter *adapter)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	bool link = adapter->link_up;
+	u32 t_rdy, r_rdy;
+	u32 reg;
+
+	/* halt data paths */
+	reg = IXGBE_READ_REG(hw, IXGBE_SECTXCTRL);
+	reg |= IXGBE_SECTXCTRL_TX_DIS;
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, reg);
+
+	reg = IXGBE_READ_REG(hw, IXGBE_SECRXCTRL);
+	reg |= IXGBE_SECRXCTRL_RX_DIS;
+	IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, reg);
+
+	IXGBE_WRITE_FLUSH(hw);
+
+	/* If the tx fifo doesn't have link, but still has data,
+	 * we can't clear the tx sec block.  Set the MAC loopback
+	 * before block clear
+	 */
+	if (!link) {
+		reg = IXGBE_READ_REG(hw, IXGBE_MACC);
+		reg |= IXGBE_MACC_FLU;
+		IXGBE_WRITE_REG(hw, IXGBE_MACC, reg);
+
+		reg = IXGBE_READ_REG(hw, IXGBE_HLREG0);
+		reg |= IXGBE_HLREG0_LPBK;
+		IXGBE_WRITE_REG(hw, IXGBE_HLREG0, reg);
+
+		IXGBE_WRITE_FLUSH(hw);
+		mdelay(3);
+	}
+
+	/* wait for the paths to empty */
+	do {
+		mdelay(10);
+		t_rdy = IXGBE_READ_REG(hw, IXGBE_SECTXSTAT) &
+			IXGBE_SECTXSTAT_SECTX_RDY;
+		r_rdy = IXGBE_READ_REG(hw, IXGBE_SECRXSTAT) &
+			IXGBE_SECRXSTAT_SECRX_RDY;
+	} while (!t_rdy && !r_rdy);
+
+	/* undo loopback if we played with it earlier */
+	if (!link) {
+		reg = IXGBE_READ_REG(hw, IXGBE_MACC);
+		reg &= ~IXGBE_MACC_FLU;
+		IXGBE_WRITE_REG(hw, IXGBE_MACC, reg);
+
+		reg = IXGBE_READ_REG(hw, IXGBE_HLREG0);
+		reg &= ~IXGBE_HLREG0_LPBK;
+		IXGBE_WRITE_REG(hw, IXGBE_HLREG0, reg);
+
+		IXGBE_WRITE_FLUSH(hw);
+	}
+}
+
+/**
+ * ixgbe_ipsec_stop_engine
+ * @adapter: board private structure
+ **/
+static void ixgbe_ipsec_stop_engine(struct ixgbe_adapter *adapter)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	u32 reg;
+
+	ixgbe_ipsec_stop_data(adapter);
+
+	/* disable Rx and Tx SA lookup */
+	IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, 0);
+	IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, 0);
+
+	/* disable the Rx and Tx engines and full packet store-n-forward */
+	reg = IXGBE_READ_REG(hw, IXGBE_SECTXCTRL);
+	reg |= IXGBE_SECTXCTRL_SECTX_DIS;
+	reg &= ~IXGBE_SECTXCTRL_STORE_FORWARD;
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, reg);
+
+	reg = IXGBE_READ_REG(hw, IXGBE_SECRXCTRL);
+	reg |= IXGBE_SECRXCTRL_SECRX_DIS;
+	IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, reg);
+
+	/* restore the "tx security buffer almost full threshold" to 0x250 */
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXBUFFAF, 0x250);
+
+	/* Set minimum IFG between packets back to the default 0x1 */
+	reg = IXGBE_READ_REG(hw, IXGBE_SECTXMINIFG);
+	reg = (reg & 0xfffffff0) | 0x1;
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXMINIFG, reg);
+
+	/* final set for normal (no ipsec offload) processing */
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, IXGBE_SECTXCTRL_SECTX_DIS);
+	IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, IXGBE_SECRXCTRL_SECRX_DIS);
+
+	IXGBE_WRITE_FLUSH(hw);
+}
+
+/**
+ * ixgbe_ipsec_start_engine
+ * @adapter: board private structure
+ *
+ * NOTE: this increases power consumption whether being used or not
+ **/
+static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	u32 reg;
+
+	ixgbe_ipsec_stop_data(adapter);
+
+	/* Set minimum IFG between packets to 3 */
+	reg = IXGBE_READ_REG(hw, IXGBE_SECTXMINIFG);
+	reg = (reg & 0xfffffff0) | 0x3;
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXMINIFG, reg);
+
+	/* Set "tx security buffer almost full threshold" to 0x15 so that the
+	 * almost full indication is generated only after buffer contains at
+	 * least an entire jumbo packet.
+	 */
+	reg = IXGBE_READ_REG(hw, IXGBE_SECTXBUFFAF);
+	reg = (reg & 0xfffffc00) | 0x15;
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXBUFFAF, reg);
+
+	/* restart the data paths by clearing the DISABLE bits */
+	IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, 0);
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, IXGBE_SECTXCTRL_STORE_FORWARD);
+
+	/* enable Rx and Tx SA lookup */
+	IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, IXGBE_RXTXIDX_IPS_EN);
+	IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, IXGBE_RXTXIDX_IPS_EN);
+
+	IXGBE_WRITE_FLUSH(hw);
+}
+
+/**
  * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
  * @adapter: board private structure
  **/
 void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
 {
 	ixgbe_ipsec_clear_hw_tables(adapter);
+	ixgbe_ipsec_stop_engine(adapter);
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 03/10] ixgbe: add ipsec engine start and stop routines
@ 2017-12-05  5:35   ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan

Add in the code for running and stopping the hardware ipsec
encryption/decryption engine.  It is good to keep the engine
off when not in use in order to save on the power draw.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 140 +++++++++++++++++++++++++
 1 file changed, 140 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 14dd011..38a1a16 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -148,10 +148,150 @@ void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
 }
 
 /**
+ * ixgbe_ipsec_stop_data
+ * @adapter: board private structure
+ **/
+static void ixgbe_ipsec_stop_data(struct ixgbe_adapter *adapter)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	bool link = adapter->link_up;
+	u32 t_rdy, r_rdy;
+	u32 reg;
+
+	/* halt data paths */
+	reg = IXGBE_READ_REG(hw, IXGBE_SECTXCTRL);
+	reg |= IXGBE_SECTXCTRL_TX_DIS;
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, reg);
+
+	reg = IXGBE_READ_REG(hw, IXGBE_SECRXCTRL);
+	reg |= IXGBE_SECRXCTRL_RX_DIS;
+	IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, reg);
+
+	IXGBE_WRITE_FLUSH(hw);
+
+	/* If the tx fifo doesn't have link, but still has data,
+	 * we can't clear the tx sec block.  Set the MAC loopback
+	 * before block clear
+	 */
+	if (!link) {
+		reg = IXGBE_READ_REG(hw, IXGBE_MACC);
+		reg |= IXGBE_MACC_FLU;
+		IXGBE_WRITE_REG(hw, IXGBE_MACC, reg);
+
+		reg = IXGBE_READ_REG(hw, IXGBE_HLREG0);
+		reg |= IXGBE_HLREG0_LPBK;
+		IXGBE_WRITE_REG(hw, IXGBE_HLREG0, reg);
+
+		IXGBE_WRITE_FLUSH(hw);
+		mdelay(3);
+	}
+
+	/* wait for the paths to empty */
+	do {
+		mdelay(10);
+		t_rdy = IXGBE_READ_REG(hw, IXGBE_SECTXSTAT) &
+			IXGBE_SECTXSTAT_SECTX_RDY;
+		r_rdy = IXGBE_READ_REG(hw, IXGBE_SECRXSTAT) &
+			IXGBE_SECRXSTAT_SECRX_RDY;
+	} while (!t_rdy && !r_rdy);
+
+	/* undo loopback if we played with it earlier */
+	if (!link) {
+		reg = IXGBE_READ_REG(hw, IXGBE_MACC);
+		reg &= ~IXGBE_MACC_FLU;
+		IXGBE_WRITE_REG(hw, IXGBE_MACC, reg);
+
+		reg = IXGBE_READ_REG(hw, IXGBE_HLREG0);
+		reg &= ~IXGBE_HLREG0_LPBK;
+		IXGBE_WRITE_REG(hw, IXGBE_HLREG0, reg);
+
+		IXGBE_WRITE_FLUSH(hw);
+	}
+}
+
+/**
+ * ixgbe_ipsec_stop_engine
+ * @adapter: board private structure
+ **/
+static void ixgbe_ipsec_stop_engine(struct ixgbe_adapter *adapter)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	u32 reg;
+
+	ixgbe_ipsec_stop_data(adapter);
+
+	/* disable Rx and Tx SA lookup */
+	IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, 0);
+	IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, 0);
+
+	/* disable the Rx and Tx engines and full packet store-n-forward */
+	reg = IXGBE_READ_REG(hw, IXGBE_SECTXCTRL);
+	reg |= IXGBE_SECTXCTRL_SECTX_DIS;
+	reg &= ~IXGBE_SECTXCTRL_STORE_FORWARD;
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, reg);
+
+	reg = IXGBE_READ_REG(hw, IXGBE_SECRXCTRL);
+	reg |= IXGBE_SECRXCTRL_SECRX_DIS;
+	IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, reg);
+
+	/* restore the "tx security buffer almost full threshold" to 0x250 */
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXBUFFAF, 0x250);
+
+	/* Set minimum IFG between packets back to the default 0x1 */
+	reg = IXGBE_READ_REG(hw, IXGBE_SECTXMINIFG);
+	reg = (reg & 0xfffffff0) | 0x1;
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXMINIFG, reg);
+
+	/* final set for normal (no ipsec offload) processing */
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, IXGBE_SECTXCTRL_SECTX_DIS);
+	IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, IXGBE_SECRXCTRL_SECRX_DIS);
+
+	IXGBE_WRITE_FLUSH(hw);
+}
+
+/**
+ * ixgbe_ipsec_start_engine
+ * @adapter: board private structure
+ *
+ * NOTE: this increases power consumption whether being used or not
+ **/
+static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	u32 reg;
+
+	ixgbe_ipsec_stop_data(adapter);
+
+	/* Set minimum IFG between packets to 3 */
+	reg = IXGBE_READ_REG(hw, IXGBE_SECTXMINIFG);
+	reg = (reg & 0xfffffff0) | 0x3;
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXMINIFG, reg);
+
+	/* Set "tx security buffer almost full threshold" to 0x15 so that the
+	 * almost full indication is generated only after buffer contains at
+	 * least an entire jumbo packet.
+	 */
+	reg = IXGBE_READ_REG(hw, IXGBE_SECTXBUFFAF);
+	reg = (reg & 0xfffffc00) | 0x15;
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXBUFFAF, reg);
+
+	/* restart the data paths by clearing the DISABLE bits */
+	IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, 0);
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, IXGBE_SECTXCTRL_STORE_FORWARD);
+
+	/* enable Rx and Tx SA lookup */
+	IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, IXGBE_RXTXIDX_IPS_EN);
+	IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, IXGBE_RXTXIDX_IPS_EN);
+
+	IXGBE_WRITE_FLUSH(hw);
+}
+
+/**
  * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
  * @adapter: board private structure
  **/
 void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
 {
 	ixgbe_ipsec_clear_hw_tables(adapter);
+	ixgbe_ipsec_stop_engine(adapter);
 }
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [next-queue 04/10] ixgbe: add ipsec data structures
  2017-12-05  5:35 ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05  5:35   ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher
  Cc: steffen.klassert, sowmini.varadhan, netdev

Set up the data structures to be used by the ipsec offload.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  5 ++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h | 40 ++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 1e11462..9487750 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -622,6 +622,7 @@ struct ixgbe_adapter {
 #define IXGBE_FLAG2_EEE_CAPABLE			BIT(14)
 #define IXGBE_FLAG2_EEE_ENABLED			BIT(15)
 #define IXGBE_FLAG2_RX_LEGACY			BIT(16)
+#define IXGBE_FLAG2_IPSEC_ENABLED		BIT(17)
 
 	/* Tx fast path data */
 	int num_tx_queues;
@@ -772,6 +773,10 @@ struct ixgbe_adapter {
 
 #define IXGBE_RSS_KEY_SIZE     40  /* size of RSS Hash Key in bytes */
 	u32 *rss_key;
+
+#ifdef CONFIG_XFRM
+	struct ixgbe_ipsec *ipsec;
+#endif /* CONFIG_XFRM */
 };
 
 static inline u8 ixgbe_max_rss_indices(struct ixgbe_adapter *adapter)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
index 017b13f..cb9a4be 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
@@ -47,4 +47,44 @@
 #define IXGBE_RXMOD_DECRYPT		0x00000008
 #define IXGBE_RXMOD_IPV6		0x00000010
 
+struct rx_sa {
+	struct hlist_node hlist;
+	struct xfrm_state *xs;
+	u32 ipaddr[4];
+	u32 key[4];
+	u32 salt;
+	u32 mode;
+	u8  iptbl_ind;
+	bool used;
+	bool decrypt;
+};
+
+struct rx_ip_sa {
+	u32 ipaddr[4];
+	u32 ref_cnt;
+	bool used;
+};
+
+struct tx_sa {
+	struct xfrm_state *xs;
+	u32 key[4];
+	u32 salt;
+	bool encrypt;
+	bool used;
+};
+
+struct ixgbe_ipsec_tx_data {
+	u32 flags;
+	u16 trailer_len;
+	u16 sa_idx;
+};
+
+struct ixgbe_ipsec {
+	u16 num_rx_sa;
+	u16 num_tx_sa;
+	struct rx_ip_sa *ip_tbl;
+	struct rx_sa *rx_tbl;
+	struct tx_sa *tx_tbl;
+	DECLARE_HASHTABLE(rx_sa_list, 8);
+};
 #endif /* _IXGBE_IPSEC_H_ */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 04/10] ixgbe: add ipsec data structures
@ 2017-12-05  5:35   ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan

Set up the data structures to be used by the ipsec offload.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  5 ++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h | 40 ++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 1e11462..9487750 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -622,6 +622,7 @@ struct ixgbe_adapter {
 #define IXGBE_FLAG2_EEE_CAPABLE			BIT(14)
 #define IXGBE_FLAG2_EEE_ENABLED			BIT(15)
 #define IXGBE_FLAG2_RX_LEGACY			BIT(16)
+#define IXGBE_FLAG2_IPSEC_ENABLED		BIT(17)
 
 	/* Tx fast path data */
 	int num_tx_queues;
@@ -772,6 +773,10 @@ struct ixgbe_adapter {
 
 #define IXGBE_RSS_KEY_SIZE     40  /* size of RSS Hash Key in bytes */
 	u32 *rss_key;
+
+#ifdef CONFIG_XFRM
+	struct ixgbe_ipsec *ipsec;
+#endif /* CONFIG_XFRM */
 };
 
 static inline u8 ixgbe_max_rss_indices(struct ixgbe_adapter *adapter)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
index 017b13f..cb9a4be 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
@@ -47,4 +47,44 @@
 #define IXGBE_RXMOD_DECRYPT		0x00000008
 #define IXGBE_RXMOD_IPV6		0x00000010
 
+struct rx_sa {
+	struct hlist_node hlist;
+	struct xfrm_state *xs;
+	u32 ipaddr[4];
+	u32 key[4];
+	u32 salt;
+	u32 mode;
+	u8  iptbl_ind;
+	bool used;
+	bool decrypt;
+};
+
+struct rx_ip_sa {
+	u32 ipaddr[4];
+	u32 ref_cnt;
+	bool used;
+};
+
+struct tx_sa {
+	struct xfrm_state *xs;
+	u32 key[4];
+	u32 salt;
+	bool encrypt;
+	bool used;
+};
+
+struct ixgbe_ipsec_tx_data {
+	u32 flags;
+	u16 trailer_len;
+	u16 sa_idx;
+};
+
+struct ixgbe_ipsec {
+	u16 num_rx_sa;
+	u16 num_tx_sa;
+	struct rx_ip_sa *ip_tbl;
+	struct rx_sa *rx_tbl;
+	struct tx_sa *tx_tbl;
+	DECLARE_HASHTABLE(rx_sa_list, 8);
+};
 #endif /* _IXGBE_IPSEC_H_ */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [next-queue 05/10] ixgbe: implement ipsec add and remove of offloaded SA
  2017-12-05  5:35 ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05  5:35   ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher
  Cc: steffen.klassert, sowmini.varadhan, netdev

Add the functions for setting up and removing offloaded SAs (Security
Associations) with the x540 hardware.  We set up the callback structure
but we don't yet set the hardware feature bit to be sure the XFRM service
won't actually try to use us for an offload yet.

The software tables are made up to mimic the hardware tables to make it
easier to track what's in the hardware, and the SA table index is used
for the XFRM offload handle.  However, there is a hashing field in the
Rx SA tracking that will be used to facilitate faster table searches in
the Rx fast path.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 377 +++++++++++++++++++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |   6 +
 2 files changed, 383 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 38a1a16..7b01d92 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -26,6 +26,8 @@
  ******************************************************************************/
 
 #include "ixgbe.h"
+#include <net/xfrm.h>
+#include <crypto/aead.h>
 
 /**
  * ixgbe_ipsec_set_tx_sa - set the Tx SA registers
@@ -128,6 +130,7 @@ static void ixgbe_ipsec_set_rx_ip(struct ixgbe_hw *hw, u16 idx, u32 addr[])
  **/
 void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
 {
+	struct ixgbe_ipsec *ipsec = adapter->ipsec;
 	struct ixgbe_hw *hw = &adapter->hw;
 	u32 buf[4] = {0, 0, 0, 0};
 	u16 idx;
@@ -139,9 +142,11 @@ void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
 	/* scrub the tables */
 	for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
 		ixgbe_ipsec_set_tx_sa(hw, idx, buf, 0);
+	ipsec->num_tx_sa = 0;
 
 	for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
 		ixgbe_ipsec_set_rx_sa(hw, idx, 0, buf, 0, 0, 0);
+	ipsec->num_rx_sa = 0;
 
 	for (idx = 0; idx < IXGBE_IPSEC_MAX_RX_IP_COUNT; idx++)
 		ixgbe_ipsec_set_rx_ip(hw, idx, buf);
@@ -287,11 +292,383 @@ static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
 }
 
 /**
+ * ixgbe_ipsec_find_empty_idx - find the first unused security parameter index
+ * @ipsec: pointer to ipsec struct
+ * @rxtable: true if we need to look in the Rx table
+ *
+ * Returns the first unused index in either the Rx or Tx SA table
+ **/
+static int ixgbe_ipsec_find_empty_idx(struct ixgbe_ipsec *ipsec, bool rxtable)
+{
+	u32 i;
+
+	if (rxtable) {
+		if (ipsec->num_rx_sa == IXGBE_IPSEC_MAX_SA_COUNT)
+			return -ENOSPC;
+
+		/* search rx sa table */
+		for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
+			if (!ipsec->rx_tbl[i].used)
+				return i;
+		}
+	} else {
+		if (ipsec->num_rx_sa == IXGBE_IPSEC_MAX_SA_COUNT)
+			return -ENOSPC;
+
+		/* search tx sa table */
+		for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
+			if (!ipsec->tx_tbl[i].used)
+				return i;
+		}
+	}
+
+	return -ENOSPC;
+}
+
+/**
+ * ixgbe_ipsec_parse_proto_keys - find the key and salt based on the protocol
+ * @xs: pointer to xfrm_state struct
+ * @mykey: pointer to key array to populate
+ * @mysalt: pointer to salt value to populate
+ *
+ * This copies the protocol keys and salt to our own data tables.  The
+ * 82599 family only supports the one algorithm.
+ **/
+static int ixgbe_ipsec_parse_proto_keys(struct xfrm_state *xs,
+					u32 *mykey, u32 *mysalt)
+{
+	struct net_device *dev = xs->xso.dev;
+	unsigned char *key_data;
+	char *alg_name = NULL;
+	char *aes_gcm_name = "rfc4106(gcm(aes))";
+	int key_len;
+
+	if (xs->aead) {
+		key_data = &xs->aead->alg_key[0];
+		key_len = xs->aead->alg_key_len;
+		alg_name = xs->aead->alg_name;
+	} else {
+		netdev_err(dev, "Unsupported IPsec algorithm\n");
+		return -EINVAL;
+	}
+
+	if (strcmp(alg_name, aes_gcm_name)) {
+		netdev_err(dev, "Unsupported IPsec algorithm - please use %s\n",
+			   aes_gcm_name);
+		return -EINVAL;
+	}
+
+	/* 160 accounts for 16 byte key and 4 byte salt */
+	if (key_len == 128) {
+		netdev_info(dev, "IPsec hw offload parameters missing 32 bit salt value\n");
+	} else if (key_len != 160) {
+		netdev_err(dev, "IPsec hw offload only supports keys up to 128 bits with a 32 bit salt\n");
+		return -EINVAL;
+	}
+
+	/* The key bytes come down in a bigendian array of bytes, and
+	 * salt is always the last 4 bytes of the key array.
+	 * We don't need to do any byteswapping.
+	 */
+	memcpy(mykey, key_data, 16);
+	if (key_len == 160)
+		*mysalt = ((u32 *)key_data)[4];
+	else
+		*mysalt = 0;
+
+	return 0;
+}
+
+/**
+ * ixgbe_ipsec_add_sa - program device with a security association
+ * @xs: pointer to transformer state struct
+ **/
+static int ixgbe_ipsec_add_sa(struct xfrm_state *xs)
+{
+	struct net_device *dev = xs->xso.dev;
+	struct ixgbe_adapter *adapter = netdev_priv(dev);
+	struct ixgbe_ipsec *ipsec = adapter->ipsec;
+	struct ixgbe_hw *hw = &adapter->hw;
+	int checked, match, first;
+	u16 sa_idx;
+	int ret;
+	int i;
+
+	if (xs->id.proto != IPPROTO_ESP && xs->id.proto != IPPROTO_AH) {
+		netdev_err(dev, "Unsupported protocol 0x%04x for ipsec offload\n",
+			   xs->id.proto);
+		return -EINVAL;
+	}
+
+	if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
+		struct rx_sa rsa;
+
+		if (xs->calg) {
+			netdev_err(dev, "Compression offload not supported\n");
+			return -EINVAL;
+		}
+
+		/* find the first unused index */
+		ret = ixgbe_ipsec_find_empty_idx(ipsec, true);
+		if (ret < 0) {
+			netdev_err(dev, "No space for SA in Rx table!\n");
+			return ret;
+		}
+		sa_idx = (u16)ret;
+
+		memset(&rsa, 0, sizeof(rsa));
+		rsa.used = true;
+		rsa.xs = xs;
+
+		if (rsa.xs->id.proto & IPPROTO_ESP)
+			rsa.decrypt = xs->ealg || xs->aead;
+
+		/* get the key and salt */
+		ret = ixgbe_ipsec_parse_proto_keys(xs, rsa.key, &rsa.salt);
+		if (ret) {
+			netdev_err(dev, "Failed to get key data for Rx SA table\n");
+			return ret;
+		}
+
+		/* get ip for rx sa table */
+		if (xs->xso.flags & XFRM_OFFLOAD_IPV6)
+			memcpy(rsa.ipaddr, &xs->id.daddr.a6, 16);
+		else
+			memcpy(&rsa.ipaddr[3], &xs->id.daddr.a4, 4);
+
+		/* The HW does not have a 1:1 mapping from keys to IP addrs, so
+		 * check for a matching IP addr entry in the table.  If the addr
+		 * already exists, use it; else find an unused slot and add the
+		 * addr.  If one does not exist and there are no unused table
+		 * entries, fail the request.
+		 */
+
+		/* Find an existing match or first not used, and stop looking
+		 * after we've checked all we know we have.
+		 */
+		checked = 0;
+		match = -1;
+		first = -1;
+		for (i = 0;
+		     i < IXGBE_IPSEC_MAX_RX_IP_COUNT &&
+		     (checked < ipsec->num_rx_sa || first < 0);
+		     i++) {
+			if (ipsec->ip_tbl[i].used) {
+				if (!memcmp(ipsec->ip_tbl[i].ipaddr,
+					    rsa.ipaddr, sizeof(rsa.ipaddr))) {
+					match = i;
+					break;
+				}
+				checked++;
+			} else if (first < 0) {
+				first = i;  /* track the first empty seen */
+			}
+		}
+
+		if (ipsec->num_rx_sa == 0)
+			first = 0;
+
+		if (match >= 0) {
+			/* addrs are the same, we should use this one */
+			rsa.iptbl_ind = match;
+			ipsec->ip_tbl[match].ref_cnt++;
+
+		} else if (first >= 0) {
+			/* no matches, but here's an empty slot */
+			rsa.iptbl_ind = first;
+
+			memcpy(ipsec->ip_tbl[first].ipaddr,
+			       rsa.ipaddr, sizeof(rsa.ipaddr));
+			ipsec->ip_tbl[first].ref_cnt = 1;
+			ipsec->ip_tbl[first].used = true;
+
+			ixgbe_ipsec_set_rx_ip(hw, rsa.iptbl_ind, rsa.ipaddr);
+
+		} else {
+			/* no match and no empty slot */
+			netdev_err(dev, "No space for SA in Rx IP SA table\n");
+			memset(&rsa, 0, sizeof(rsa));
+			return -ENOSPC;
+		}
+
+		rsa.mode = IXGBE_RXMOD_VALID;
+		if (rsa.xs->id.proto & IPPROTO_ESP)
+			rsa.mode |= IXGBE_RXMOD_PROTO_ESP;
+		if (rsa.decrypt)
+			rsa.mode |= IXGBE_RXMOD_DECRYPT;
+		if (rsa.xs->xso.flags & XFRM_OFFLOAD_IPV6)
+			rsa.mode |= IXGBE_RXMOD_IPV6;
+
+		/* the preparations worked, so save the info */
+		memcpy(&ipsec->rx_tbl[sa_idx], &rsa, sizeof(rsa));
+
+		ixgbe_ipsec_set_rx_sa(hw, sa_idx, rsa.xs->id.spi, rsa.key,
+				      rsa.salt, rsa.mode, rsa.iptbl_ind);
+		xs->xso.offload_handle = sa_idx + IXGBE_IPSEC_BASE_RX_INDEX;
+
+		ipsec->num_rx_sa++;
+
+		/* hash the new entry for faster search in Rx path */
+		hash_add_rcu(ipsec->rx_sa_list, &ipsec->rx_tbl[sa_idx].hlist,
+			     rsa.xs->id.spi);
+	} else {
+		struct tx_sa tsa;
+
+		/* find the first unused index */
+		ret = ixgbe_ipsec_find_empty_idx(ipsec, false);
+		if (ret < 0) {
+			netdev_err(dev, "No space for SA in Tx table\n");
+			return ret;
+		}
+		sa_idx = (u16)ret;
+
+		memset(&tsa, 0, sizeof(tsa));
+		tsa.used = true;
+		tsa.xs = xs;
+
+		if (xs->id.proto & IPPROTO_ESP)
+			tsa.encrypt = xs->ealg || xs->aead;
+
+		ret = ixgbe_ipsec_parse_proto_keys(xs, tsa.key, &tsa.salt);
+		if (ret) {
+			netdev_err(dev, "Failed to get key data for Tx SA table\n");
+			memset(&tsa, 0, sizeof(tsa));
+			return ret;
+		}
+
+		/* the preparations worked, so save the info */
+		memcpy(&ipsec->tx_tbl[sa_idx], &tsa, sizeof(tsa));
+
+		ixgbe_ipsec_set_tx_sa(hw, sa_idx, tsa.key, tsa.salt);
+
+		xs->xso.offload_handle = sa_idx + IXGBE_IPSEC_BASE_TX_INDEX;
+
+		ipsec->num_tx_sa++;
+	}
+
+	/* enable the engine if not already warmed up */
+	if (!(adapter->flags2 & IXGBE_FLAG2_IPSEC_ENABLED)) {
+		ixgbe_ipsec_start_engine(adapter);
+		adapter->flags2 |= IXGBE_FLAG2_IPSEC_ENABLED;
+	}
+
+	return 0;
+}
+
+/**
+ * ixgbe_ipsec_del_sa - clear out this specific SA
+ * @xs: pointer to transformer state struct
+ **/
+static void ixgbe_ipsec_del_sa(struct xfrm_state *xs)
+{
+	struct net_device *dev = xs->xso.dev;
+	struct ixgbe_adapter *adapter = netdev_priv(dev);
+	struct ixgbe_ipsec *ipsec = adapter->ipsec;
+	struct ixgbe_hw *hw = &adapter->hw;
+	u32 zerobuf[4] = {0, 0, 0, 0};
+	u16 sa_idx;
+
+	if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
+		struct rx_sa *rsa;
+		u8 ipi;
+
+		sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_RX_INDEX;
+		rsa = &ipsec->rx_tbl[sa_idx];
+
+		if (!rsa->used) {
+			netdev_err(dev, "Invalid Rx SA selected sa_idx=%d offload_handle=%lu\n",
+				   sa_idx, xs->xso.offload_handle);
+			return;
+		}
+
+		ixgbe_ipsec_set_rx_sa(hw, sa_idx, 0, zerobuf, 0, 0, 0);
+		hash_del_rcu(&rsa->hlist);
+
+		/* if the IP table entry is referenced by only this SA,
+		 * i.e. ref_cnt is only 1, clear the IP table entry as well
+		 */
+		ipi = rsa->iptbl_ind;
+		if (ipsec->ip_tbl[ipi].ref_cnt > 0) {
+			ipsec->ip_tbl[ipi].ref_cnt--;
+
+			if (!ipsec->ip_tbl[ipi].ref_cnt) {
+				memset(&ipsec->ip_tbl[ipi], 0,
+				       sizeof(struct rx_ip_sa));
+				ixgbe_ipsec_set_rx_ip(hw, ipi, zerobuf);
+			}
+		}
+
+		memset(rsa, 0, sizeof(struct rx_sa));
+		ipsec->num_rx_sa--;
+	} else {
+		sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX;
+
+		if (!ipsec->tx_tbl[sa_idx].used) {
+			netdev_err(dev, "Invalid Tx SA selected sa_idx=%d offload_handle=%lu\n",
+				   sa_idx, xs->xso.offload_handle);
+			return;
+		}
+
+		ixgbe_ipsec_set_tx_sa(hw, sa_idx, zerobuf, 0);
+		memset(&ipsec->tx_tbl[sa_idx], 0, sizeof(struct tx_sa));
+		ipsec->num_tx_sa--;
+	}
+
+	/* if there are no SAs left, stop the engine to save energy */
+	if (ipsec->num_rx_sa == 0 && ipsec->num_tx_sa == 0) {
+		adapter->flags2 &= ~IXGBE_FLAG2_IPSEC_ENABLED;
+		ixgbe_ipsec_stop_engine(adapter);
+	}
+}
+
+static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
+	.xdo_dev_state_add = ixgbe_ipsec_add_sa,
+	.xdo_dev_state_delete = ixgbe_ipsec_del_sa,
+};
+
+/**
  * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
  * @adapter: board private structure
  **/
 void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
 {
+	struct ixgbe_ipsec *ipsec;
+	size_t size;
+
+	ipsec = kzalloc(sizeof(*ipsec), GFP_KERNEL);
+	if (!ipsec)
+		goto err;
+	hash_init(ipsec->rx_sa_list);
+
+	size = sizeof(struct rx_sa) * IXGBE_IPSEC_MAX_SA_COUNT;
+	ipsec->rx_tbl = kzalloc(size, GFP_KERNEL);
+	if (!ipsec->rx_tbl)
+		goto err;
+
+	size = sizeof(struct tx_sa) * IXGBE_IPSEC_MAX_SA_COUNT;
+	ipsec->tx_tbl = kzalloc(size, GFP_KERNEL);
+	if (!ipsec->tx_tbl)
+		goto err;
+
+	size = sizeof(struct rx_ip_sa) * IXGBE_IPSEC_MAX_RX_IP_COUNT;
+	ipsec->ip_tbl = kzalloc(size, GFP_KERNEL);
+	if (!ipsec->ip_tbl)
+		goto err;
+
+	ipsec->num_rx_sa = 0;
+	ipsec->num_tx_sa = 0;
+
+	adapter->ipsec = ipsec;
 	ixgbe_ipsec_clear_hw_tables(adapter);
 	ixgbe_ipsec_stop_engine(adapter);
+
+	return;
+err:
+	if (ipsec) {
+		kfree(ipsec->ip_tbl);
+		kfree(ipsec->rx_tbl);
+		kfree(ipsec->tx_tbl);
+		kfree(adapter->ipsec);
+	}
+	netdev_err(adapter->netdev, "Unable to allocate memory for SA tables");
 }
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 51fb3cf..01fd89b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -10542,6 +10542,12 @@ static void ixgbe_remove(struct pci_dev *pdev)
 	set_bit(__IXGBE_REMOVING, &adapter->state);
 	cancel_work_sync(&adapter->service_task);
 
+#ifdef CONFIG_XFRM
+	kfree(adapter->ipsec->ip_tbl);
+	kfree(adapter->ipsec->rx_tbl);
+	kfree(adapter->ipsec->tx_tbl);
+	kfree(adapter->ipsec);
+#endif /* CONFIG_XFRM */
 
 #ifdef CONFIG_IXGBE_DCA
 	if (adapter->flags & IXGBE_FLAG_DCA_ENABLED) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 05/10] ixgbe: implement ipsec add and remove of offloaded SA
@ 2017-12-05  5:35   ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan

Add the functions for setting up and removing offloaded SAs (Security
Associations) with the x540 hardware.  We set up the callback structure
but we don't yet set the hardware feature bit to be sure the XFRM service
won't actually try to use us for an offload yet.

The software tables are made up to mimic the hardware tables to make it
easier to track what's in the hardware, and the SA table index is used
for the XFRM offload handle.  However, there is a hashing field in the
Rx SA tracking that will be used to facilitate faster table searches in
the Rx fast path.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 377 +++++++++++++++++++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |   6 +
 2 files changed, 383 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 38a1a16..7b01d92 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -26,6 +26,8 @@
  ******************************************************************************/
 
 #include "ixgbe.h"
+#include <net/xfrm.h>
+#include <crypto/aead.h>
 
 /**
  * ixgbe_ipsec_set_tx_sa - set the Tx SA registers
@@ -128,6 +130,7 @@ static void ixgbe_ipsec_set_rx_ip(struct ixgbe_hw *hw, u16 idx, u32 addr[])
  **/
 void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
 {
+	struct ixgbe_ipsec *ipsec = adapter->ipsec;
 	struct ixgbe_hw *hw = &adapter->hw;
 	u32 buf[4] = {0, 0, 0, 0};
 	u16 idx;
@@ -139,9 +142,11 @@ void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
 	/* scrub the tables */
 	for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
 		ixgbe_ipsec_set_tx_sa(hw, idx, buf, 0);
+	ipsec->num_tx_sa = 0;
 
 	for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
 		ixgbe_ipsec_set_rx_sa(hw, idx, 0, buf, 0, 0, 0);
+	ipsec->num_rx_sa = 0;
 
 	for (idx = 0; idx < IXGBE_IPSEC_MAX_RX_IP_COUNT; idx++)
 		ixgbe_ipsec_set_rx_ip(hw, idx, buf);
@@ -287,11 +292,383 @@ static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
 }
 
 /**
+ * ixgbe_ipsec_find_empty_idx - find the first unused security parameter index
+ * @ipsec: pointer to ipsec struct
+ * @rxtable: true if we need to look in the Rx table
+ *
+ * Returns the first unused index in either the Rx or Tx SA table
+ **/
+static int ixgbe_ipsec_find_empty_idx(struct ixgbe_ipsec *ipsec, bool rxtable)
+{
+	u32 i;
+
+	if (rxtable) {
+		if (ipsec->num_rx_sa == IXGBE_IPSEC_MAX_SA_COUNT)
+			return -ENOSPC;
+
+		/* search rx sa table */
+		for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
+			if (!ipsec->rx_tbl[i].used)
+				return i;
+		}
+	} else {
+		if (ipsec->num_rx_sa == IXGBE_IPSEC_MAX_SA_COUNT)
+			return -ENOSPC;
+
+		/* search tx sa table */
+		for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
+			if (!ipsec->tx_tbl[i].used)
+				return i;
+		}
+	}
+
+	return -ENOSPC;
+}
+
+/**
+ * ixgbe_ipsec_parse_proto_keys - find the key and salt based on the protocol
+ * @xs: pointer to xfrm_state struct
+ * @mykey: pointer to key array to populate
+ * @mysalt: pointer to salt value to populate
+ *
+ * This copies the protocol keys and salt to our own data tables.  The
+ * 82599 family only supports the one algorithm.
+ **/
+static int ixgbe_ipsec_parse_proto_keys(struct xfrm_state *xs,
+					u32 *mykey, u32 *mysalt)
+{
+	struct net_device *dev = xs->xso.dev;
+	unsigned char *key_data;
+	char *alg_name = NULL;
+	char *aes_gcm_name = "rfc4106(gcm(aes))";
+	int key_len;
+
+	if (xs->aead) {
+		key_data = &xs->aead->alg_key[0];
+		key_len = xs->aead->alg_key_len;
+		alg_name = xs->aead->alg_name;
+	} else {
+		netdev_err(dev, "Unsupported IPsec algorithm\n");
+		return -EINVAL;
+	}
+
+	if (strcmp(alg_name, aes_gcm_name)) {
+		netdev_err(dev, "Unsupported IPsec algorithm - please use %s\n",
+			   aes_gcm_name);
+		return -EINVAL;
+	}
+
+	/* 160 accounts for 16 byte key and 4 byte salt */
+	if (key_len == 128) {
+		netdev_info(dev, "IPsec hw offload parameters missing 32 bit salt value\n");
+	} else if (key_len != 160) {
+		netdev_err(dev, "IPsec hw offload only supports keys up to 128 bits with a 32 bit salt\n");
+		return -EINVAL;
+	}
+
+	/* The key bytes come down in a bigendian array of bytes, and
+	 * salt is always the last 4 bytes of the key array.
+	 * We don't need to do any byteswapping.
+	 */
+	memcpy(mykey, key_data, 16);
+	if (key_len == 160)
+		*mysalt = ((u32 *)key_data)[4];
+	else
+		*mysalt = 0;
+
+	return 0;
+}
+
+/**
+ * ixgbe_ipsec_add_sa - program device with a security association
+ * @xs: pointer to transformer state struct
+ **/
+static int ixgbe_ipsec_add_sa(struct xfrm_state *xs)
+{
+	struct net_device *dev = xs->xso.dev;
+	struct ixgbe_adapter *adapter = netdev_priv(dev);
+	struct ixgbe_ipsec *ipsec = adapter->ipsec;
+	struct ixgbe_hw *hw = &adapter->hw;
+	int checked, match, first;
+	u16 sa_idx;
+	int ret;
+	int i;
+
+	if (xs->id.proto != IPPROTO_ESP && xs->id.proto != IPPROTO_AH) {
+		netdev_err(dev, "Unsupported protocol 0x%04x for ipsec offload\n",
+			   xs->id.proto);
+		return -EINVAL;
+	}
+
+	if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
+		struct rx_sa rsa;
+
+		if (xs->calg) {
+			netdev_err(dev, "Compression offload not supported\n");
+			return -EINVAL;
+		}
+
+		/* find the first unused index */
+		ret = ixgbe_ipsec_find_empty_idx(ipsec, true);
+		if (ret < 0) {
+			netdev_err(dev, "No space for SA in Rx table!\n");
+			return ret;
+		}
+		sa_idx = (u16)ret;
+
+		memset(&rsa, 0, sizeof(rsa));
+		rsa.used = true;
+		rsa.xs = xs;
+
+		if (rsa.xs->id.proto & IPPROTO_ESP)
+			rsa.decrypt = xs->ealg || xs->aead;
+
+		/* get the key and salt */
+		ret = ixgbe_ipsec_parse_proto_keys(xs, rsa.key, &rsa.salt);
+		if (ret) {
+			netdev_err(dev, "Failed to get key data for Rx SA table\n");
+			return ret;
+		}
+
+		/* get ip for rx sa table */
+		if (xs->xso.flags & XFRM_OFFLOAD_IPV6)
+			memcpy(rsa.ipaddr, &xs->id.daddr.a6, 16);
+		else
+			memcpy(&rsa.ipaddr[3], &xs->id.daddr.a4, 4);
+
+		/* The HW does not have a 1:1 mapping from keys to IP addrs, so
+		 * check for a matching IP addr entry in the table.  If the addr
+		 * already exists, use it; else find an unused slot and add the
+		 * addr.  If one does not exist and there are no unused table
+		 * entries, fail the request.
+		 */
+
+		/* Find an existing match or first not used, and stop looking
+		 * after we've checked all we know we have.
+		 */
+		checked = 0;
+		match = -1;
+		first = -1;
+		for (i = 0;
+		     i < IXGBE_IPSEC_MAX_RX_IP_COUNT &&
+		     (checked < ipsec->num_rx_sa || first < 0);
+		     i++) {
+			if (ipsec->ip_tbl[i].used) {
+				if (!memcmp(ipsec->ip_tbl[i].ipaddr,
+					    rsa.ipaddr, sizeof(rsa.ipaddr))) {
+					match = i;
+					break;
+				}
+				checked++;
+			} else if (first < 0) {
+				first = i;  /* track the first empty seen */
+			}
+		}
+
+		if (ipsec->num_rx_sa == 0)
+			first = 0;
+
+		if (match >= 0) {
+			/* addrs are the same, we should use this one */
+			rsa.iptbl_ind = match;
+			ipsec->ip_tbl[match].ref_cnt++;
+
+		} else if (first >= 0) {
+			/* no matches, but here's an empty slot */
+			rsa.iptbl_ind = first;
+
+			memcpy(ipsec->ip_tbl[first].ipaddr,
+			       rsa.ipaddr, sizeof(rsa.ipaddr));
+			ipsec->ip_tbl[first].ref_cnt = 1;
+			ipsec->ip_tbl[first].used = true;
+
+			ixgbe_ipsec_set_rx_ip(hw, rsa.iptbl_ind, rsa.ipaddr);
+
+		} else {
+			/* no match and no empty slot */
+			netdev_err(dev, "No space for SA in Rx IP SA table\n");
+			memset(&rsa, 0, sizeof(rsa));
+			return -ENOSPC;
+		}
+
+		rsa.mode = IXGBE_RXMOD_VALID;
+		if (rsa.xs->id.proto & IPPROTO_ESP)
+			rsa.mode |= IXGBE_RXMOD_PROTO_ESP;
+		if (rsa.decrypt)
+			rsa.mode |= IXGBE_RXMOD_DECRYPT;
+		if (rsa.xs->xso.flags & XFRM_OFFLOAD_IPV6)
+			rsa.mode |= IXGBE_RXMOD_IPV6;
+
+		/* the preparations worked, so save the info */
+		memcpy(&ipsec->rx_tbl[sa_idx], &rsa, sizeof(rsa));
+
+		ixgbe_ipsec_set_rx_sa(hw, sa_idx, rsa.xs->id.spi, rsa.key,
+				      rsa.salt, rsa.mode, rsa.iptbl_ind);
+		xs->xso.offload_handle = sa_idx + IXGBE_IPSEC_BASE_RX_INDEX;
+
+		ipsec->num_rx_sa++;
+
+		/* hash the new entry for faster search in Rx path */
+		hash_add_rcu(ipsec->rx_sa_list, &ipsec->rx_tbl[sa_idx].hlist,
+			     rsa.xs->id.spi);
+	} else {
+		struct tx_sa tsa;
+
+		/* find the first unused index */
+		ret = ixgbe_ipsec_find_empty_idx(ipsec, false);
+		if (ret < 0) {
+			netdev_err(dev, "No space for SA in Tx table\n");
+			return ret;
+		}
+		sa_idx = (u16)ret;
+
+		memset(&tsa, 0, sizeof(tsa));
+		tsa.used = true;
+		tsa.xs = xs;
+
+		if (xs->id.proto & IPPROTO_ESP)
+			tsa.encrypt = xs->ealg || xs->aead;
+
+		ret = ixgbe_ipsec_parse_proto_keys(xs, tsa.key, &tsa.salt);
+		if (ret) {
+			netdev_err(dev, "Failed to get key data for Tx SA table\n");
+			memset(&tsa, 0, sizeof(tsa));
+			return ret;
+		}
+
+		/* the preparations worked, so save the info */
+		memcpy(&ipsec->tx_tbl[sa_idx], &tsa, sizeof(tsa));
+
+		ixgbe_ipsec_set_tx_sa(hw, sa_idx, tsa.key, tsa.salt);
+
+		xs->xso.offload_handle = sa_idx + IXGBE_IPSEC_BASE_TX_INDEX;
+
+		ipsec->num_tx_sa++;
+	}
+
+	/* enable the engine if not already warmed up */
+	if (!(adapter->flags2 & IXGBE_FLAG2_IPSEC_ENABLED)) {
+		ixgbe_ipsec_start_engine(adapter);
+		adapter->flags2 |= IXGBE_FLAG2_IPSEC_ENABLED;
+	}
+
+	return 0;
+}
+
+/**
+ * ixgbe_ipsec_del_sa - clear out this specific SA
+ * @xs: pointer to transformer state struct
+ **/
+static void ixgbe_ipsec_del_sa(struct xfrm_state *xs)
+{
+	struct net_device *dev = xs->xso.dev;
+	struct ixgbe_adapter *adapter = netdev_priv(dev);
+	struct ixgbe_ipsec *ipsec = adapter->ipsec;
+	struct ixgbe_hw *hw = &adapter->hw;
+	u32 zerobuf[4] = {0, 0, 0, 0};
+	u16 sa_idx;
+
+	if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
+		struct rx_sa *rsa;
+		u8 ipi;
+
+		sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_RX_INDEX;
+		rsa = &ipsec->rx_tbl[sa_idx];
+
+		if (!rsa->used) {
+			netdev_err(dev, "Invalid Rx SA selected sa_idx=%d offload_handle=%lu\n",
+				   sa_idx, xs->xso.offload_handle);
+			return;
+		}
+
+		ixgbe_ipsec_set_rx_sa(hw, sa_idx, 0, zerobuf, 0, 0, 0);
+		hash_del_rcu(&rsa->hlist);
+
+		/* if the IP table entry is referenced by only this SA,
+		 * i.e. ref_cnt is only 1, clear the IP table entry as well
+		 */
+		ipi = rsa->iptbl_ind;
+		if (ipsec->ip_tbl[ipi].ref_cnt > 0) {
+			ipsec->ip_tbl[ipi].ref_cnt--;
+
+			if (!ipsec->ip_tbl[ipi].ref_cnt) {
+				memset(&ipsec->ip_tbl[ipi], 0,
+				       sizeof(struct rx_ip_sa));
+				ixgbe_ipsec_set_rx_ip(hw, ipi, zerobuf);
+			}
+		}
+
+		memset(rsa, 0, sizeof(struct rx_sa));
+		ipsec->num_rx_sa--;
+	} else {
+		sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX;
+
+		if (!ipsec->tx_tbl[sa_idx].used) {
+			netdev_err(dev, "Invalid Tx SA selected sa_idx=%d offload_handle=%lu\n",
+				   sa_idx, xs->xso.offload_handle);
+			return;
+		}
+
+		ixgbe_ipsec_set_tx_sa(hw, sa_idx, zerobuf, 0);
+		memset(&ipsec->tx_tbl[sa_idx], 0, sizeof(struct tx_sa));
+		ipsec->num_tx_sa--;
+	}
+
+	/* if there are no SAs left, stop the engine to save energy */
+	if (ipsec->num_rx_sa == 0 && ipsec->num_tx_sa == 0) {
+		adapter->flags2 &= ~IXGBE_FLAG2_IPSEC_ENABLED;
+		ixgbe_ipsec_stop_engine(adapter);
+	}
+}
+
+static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
+	.xdo_dev_state_add = ixgbe_ipsec_add_sa,
+	.xdo_dev_state_delete = ixgbe_ipsec_del_sa,
+};
+
+/**
  * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
  * @adapter: board private structure
  **/
 void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
 {
+	struct ixgbe_ipsec *ipsec;
+	size_t size;
+
+	ipsec = kzalloc(sizeof(*ipsec), GFP_KERNEL);
+	if (!ipsec)
+		goto err;
+	hash_init(ipsec->rx_sa_list);
+
+	size = sizeof(struct rx_sa) * IXGBE_IPSEC_MAX_SA_COUNT;
+	ipsec->rx_tbl = kzalloc(size, GFP_KERNEL);
+	if (!ipsec->rx_tbl)
+		goto err;
+
+	size = sizeof(struct tx_sa) * IXGBE_IPSEC_MAX_SA_COUNT;
+	ipsec->tx_tbl = kzalloc(size, GFP_KERNEL);
+	if (!ipsec->tx_tbl)
+		goto err;
+
+	size = sizeof(struct rx_ip_sa) * IXGBE_IPSEC_MAX_RX_IP_COUNT;
+	ipsec->ip_tbl = kzalloc(size, GFP_KERNEL);
+	if (!ipsec->ip_tbl)
+		goto err;
+
+	ipsec->num_rx_sa = 0;
+	ipsec->num_tx_sa = 0;
+
+	adapter->ipsec = ipsec;
 	ixgbe_ipsec_clear_hw_tables(adapter);
 	ixgbe_ipsec_stop_engine(adapter);
+
+	return;
+err:
+	if (ipsec) {
+		kfree(ipsec->ip_tbl);
+		kfree(ipsec->rx_tbl);
+		kfree(ipsec->tx_tbl);
+		kfree(adapter->ipsec);
+	}
+	netdev_err(adapter->netdev, "Unable to allocate memory for SA tables");
 }
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 51fb3cf..01fd89b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -10542,6 +10542,12 @@ static void ixgbe_remove(struct pci_dev *pdev)
 	set_bit(__IXGBE_REMOVING, &adapter->state);
 	cancel_work_sync(&adapter->service_task);
 
+#ifdef CONFIG_XFRM
+	kfree(adapter->ipsec->ip_tbl);
+	kfree(adapter->ipsec->rx_tbl);
+	kfree(adapter->ipsec->tx_tbl);
+	kfree(adapter->ipsec);
+#endif /* CONFIG_XFRM */
 
 #ifdef CONFIG_IXGBE_DCA
 	if (adapter->flags & IXGBE_FLAG_DCA_ENABLED) {
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [next-queue 06/10] ixgbe: restore offloaded SAs after a reset
  2017-12-05  5:35 ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05  5:35   ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher
  Cc: steffen.klassert, sowmini.varadhan, netdev

On a chip reset most of the table contents are lost, so must be
restored.  This scans the driver's ipsec tables and restores both
the filled and empty table slots to their pre-reset values.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  2 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 53 ++++++++++++++++++++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  1 +
 3 files changed, 56 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 9487750..7e8bca7 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -1009,7 +1009,9 @@ s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
 		       u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
 #ifdef CONFIG_XFRM_OFFLOAD
 void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
+void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
 #else
 static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
+static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
 #endif /* CONFIG_XFRM_OFFLOAD */
 #endif /* _IXGBE_H_ */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 7b01d92..b93ee7f 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -292,6 +292,59 @@ static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
 }
 
 /**
+ * ixgbe_ipsec_restore - restore the ipsec HW settings after a reset
+ * @adapter: board private structure
+ **/
+void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter)
+{
+	struct ixgbe_ipsec *ipsec = adapter->ipsec;
+	struct ixgbe_hw *hw = &adapter->hw;
+	u32 zbuf[4] = {0, 0, 0, 0};
+	int i;
+
+	if (!(adapter->flags2 & IXGBE_FLAG2_IPSEC_ENABLED))
+		return;
+
+	/* clean up the engine settings */
+	ixgbe_ipsec_stop_engine(adapter);
+
+	/* start the engine */
+	ixgbe_ipsec_start_engine(adapter);
+
+	/* reload the IP addrs */
+	for (i = 0; i < IXGBE_IPSEC_MAX_RX_IP_COUNT; i++) {
+		struct rx_ip_sa *ipsa = &ipsec->ip_tbl[i];
+
+		if (ipsa->used)
+			ixgbe_ipsec_set_rx_ip(hw, i, ipsa->ipaddr);
+		else
+			ixgbe_ipsec_set_rx_ip(hw, i, zbuf);
+	}
+
+	/* reload the Rx keys */
+	for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
+		struct rx_sa *rsa = &ipsec->rx_tbl[i];
+
+		if (rsa->used)
+			ixgbe_ipsec_set_rx_sa(hw, i, rsa->xs->id.spi,
+					      rsa->key, rsa->salt,
+					      rsa->mode, rsa->iptbl_ind);
+		else
+			ixgbe_ipsec_set_rx_sa(hw, i, 0, zbuf, 0, 0, 0);
+	}
+
+	/* reload the Tx keys */
+	for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
+		struct tx_sa *tsa = &ipsec->tx_tbl[i];
+
+		if (tsa->used)
+			ixgbe_ipsec_set_tx_sa(hw, i, tsa->key, tsa->salt);
+		else
+			ixgbe_ipsec_set_tx_sa(hw, i, zbuf, 0);
+	}
+}
+
+/**
  * ixgbe_ipsec_find_empty_idx - find the first unused security parameter index
  * @ipsec: pointer to ipsec struct
  * @rxtable: true if we need to look in the Rx table
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 01fd89b..6eabf92 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -5347,6 +5347,7 @@ static void ixgbe_configure(struct ixgbe_adapter *adapter)
 
 	ixgbe_set_rx_mode(adapter->netdev);
 	ixgbe_restore_vlan(adapter);
+	ixgbe_ipsec_restore(adapter);
 
 	switch (hw->mac.type) {
 	case ixgbe_mac_82599EB:
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 06/10] ixgbe: restore offloaded SAs after a reset
@ 2017-12-05  5:35   ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan

On a chip reset most of the table contents are lost, so must be
restored.  This scans the driver's ipsec tables and restores both
the filled and empty table slots to their pre-reset values.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  2 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 53 ++++++++++++++++++++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  1 +
 3 files changed, 56 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 9487750..7e8bca7 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -1009,7 +1009,9 @@ s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
 		       u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
 #ifdef CONFIG_XFRM_OFFLOAD
 void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
+void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
 #else
 static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
+static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
 #endif /* CONFIG_XFRM_OFFLOAD */
 #endif /* _IXGBE_H_ */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 7b01d92..b93ee7f 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -292,6 +292,59 @@ static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
 }
 
 /**
+ * ixgbe_ipsec_restore - restore the ipsec HW settings after a reset
+ * @adapter: board private structure
+ **/
+void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter)
+{
+	struct ixgbe_ipsec *ipsec = adapter->ipsec;
+	struct ixgbe_hw *hw = &adapter->hw;
+	u32 zbuf[4] = {0, 0, 0, 0};
+	int i;
+
+	if (!(adapter->flags2 & IXGBE_FLAG2_IPSEC_ENABLED))
+		return;
+
+	/* clean up the engine settings */
+	ixgbe_ipsec_stop_engine(adapter);
+
+	/* start the engine */
+	ixgbe_ipsec_start_engine(adapter);
+
+	/* reload the IP addrs */
+	for (i = 0; i < IXGBE_IPSEC_MAX_RX_IP_COUNT; i++) {
+		struct rx_ip_sa *ipsa = &ipsec->ip_tbl[i];
+
+		if (ipsa->used)
+			ixgbe_ipsec_set_rx_ip(hw, i, ipsa->ipaddr);
+		else
+			ixgbe_ipsec_set_rx_ip(hw, i, zbuf);
+	}
+
+	/* reload the Rx keys */
+	for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
+		struct rx_sa *rsa = &ipsec->rx_tbl[i];
+
+		if (rsa->used)
+			ixgbe_ipsec_set_rx_sa(hw, i, rsa->xs->id.spi,
+					      rsa->key, rsa->salt,
+					      rsa->mode, rsa->iptbl_ind);
+		else
+			ixgbe_ipsec_set_rx_sa(hw, i, 0, zbuf, 0, 0, 0);
+	}
+
+	/* reload the Tx keys */
+	for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
+		struct tx_sa *tsa = &ipsec->tx_tbl[i];
+
+		if (tsa->used)
+			ixgbe_ipsec_set_tx_sa(hw, i, tsa->key, tsa->salt);
+		else
+			ixgbe_ipsec_set_tx_sa(hw, i, zbuf, 0);
+	}
+}
+
+/**
  * ixgbe_ipsec_find_empty_idx - find the first unused security parameter index
  * @ipsec: pointer to ipsec struct
  * @rxtable: true if we need to look in the Rx table
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 01fd89b..6eabf92 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -5347,6 +5347,7 @@ static void ixgbe_configure(struct ixgbe_adapter *adapter)
 
 	ixgbe_set_rx_mode(adapter->netdev);
 	ixgbe_restore_vlan(adapter);
+	ixgbe_ipsec_restore(adapter);
 
 	switch (hw->mac.type) {
 	case ixgbe_mac_82599EB:
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [next-queue 07/10] ixgbe: process the Rx ipsec offload
  2017-12-05  5:35 ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05  5:35   ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher
  Cc: steffen.klassert, sowmini.varadhan, netdev

If the chip sees and decrypts an ipsec offload, set up the skb
sp pointer with the ralated SA info.  Since the chip is rude
enough to keep to itself the table index it used for the
decryption, we have to do our own table lookup, using the
hash for speed.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  6 ++
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 89 ++++++++++++++++++++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  3 +
 3 files changed, 98 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 7e8bca7..77f07dc 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -1009,9 +1009,15 @@ s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
 		       u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
 #ifdef CONFIG_XFRM_OFFLOAD
 void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
+void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
+		    union ixgbe_adv_rx_desc *rx_desc,
+		    struct sk_buff *skb);
 void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
 #else
 static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
+static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
+				  union ixgbe_adv_rx_desc *rx_desc,
+				  struct sk_buff *skb) { };
 static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
 #endif /* CONFIG_XFRM_OFFLOAD */
 #endif /* _IXGBE_H_ */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index b93ee7f..fd06d9b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -379,6 +379,35 @@ static int ixgbe_ipsec_find_empty_idx(struct ixgbe_ipsec *ipsec, bool rxtable)
 }
 
 /**
+ * ixgbe_ipsec_find_rx_state - find the state that matches
+ * @ipsec: pointer to ipsec struct
+ * @daddr: inbound address to match
+ * @proto: protocol to match
+ * @spi: SPI to match
+ *
+ * Returns a pointer to the matching SA state information
+ **/
+static struct xfrm_state *ixgbe_ipsec_find_rx_state(struct ixgbe_ipsec *ipsec,
+						    __be32 daddr, u8 proto,
+						    __be32 spi)
+{
+	struct rx_sa *rsa;
+	struct xfrm_state *ret = NULL;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(ipsec->rx_sa_list, rsa, hlist, spi)
+		if (spi == rsa->xs->id.spi &&
+		    daddr == rsa->xs->id.daddr.a4 &&
+		    proto == rsa->xs->id.proto) {
+			ret = rsa->xs;
+			xfrm_state_hold(ret);
+			break;
+		}
+	rcu_read_unlock();
+	return ret;
+}
+
+/**
  * ixgbe_ipsec_parse_proto_keys - find the key and salt based on the protocol
  * @xs: pointer to xfrm_state struct
  * @mykey: pointer to key array to populate
@@ -680,6 +709,66 @@ static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
 };
 
 /**
+ * ixgbe_ipsec_rx - decode ipsec bits from Rx descriptor
+ * @rx_ring: receiving ring
+ * @rx_desc: receive data descriptor
+ * @skb: current data packet
+ *
+ * Determine if there was an ipsec encapsulation noticed, and if so set up
+ * the resulting status for later in the receive stack.
+ **/
+void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
+		    union ixgbe_adv_rx_desc *rx_desc,
+		    struct sk_buff *skb)
+{
+	struct ixgbe_adapter *adapter = netdev_priv(rx_ring->netdev);
+	u16 pkt_info = le16_to_cpu(rx_desc->wb.lower.lo_dword.hs_rss.pkt_info);
+	u16 ipsec_pkt_types = IXGBE_RXDADV_PKTTYPE_IPSEC_AH |
+				IXGBE_RXDADV_PKTTYPE_IPSEC_ESP;
+	struct ixgbe_ipsec *ipsec = adapter->ipsec;
+	struct xfrm_offload *xo = NULL;
+	struct xfrm_state *xs = NULL;
+	struct iphdr *iph;
+	u8 *c_hdr;
+	__be32 spi;
+	u8 proto;
+
+	/* we can assume no vlan header in the way, b/c the
+	 * hw won't recognize the IPsec packet and anyway the
+	 * currently vlan device doesn't support xfrm offload.
+	 */
+	/* TODO: not supporting IPv6 yet */
+	iph = (struct iphdr *)(skb->data + ETH_HLEN);
+	c_hdr = (u8 *)iph + iph->ihl * 4;
+	switch (pkt_info & ipsec_pkt_types) {
+	case IXGBE_RXDADV_PKTTYPE_IPSEC_AH:
+		spi = ((struct ip_auth_hdr *)c_hdr)->spi;
+		proto = IPPROTO_AH;
+		break;
+	case IXGBE_RXDADV_PKTTYPE_IPSEC_ESP:
+		spi = ((struct ip_esp_hdr *)c_hdr)->spi;
+		proto = IPPROTO_ESP;
+		break;
+	default:
+		return;
+	}
+
+	xs = ixgbe_ipsec_find_rx_state(ipsec, iph->daddr, proto, spi);
+	if (unlikely(!xs))
+		return;
+
+	skb->sp = secpath_dup(skb->sp);
+	if (unlikely(!skb->sp))
+		return;
+
+	skb->sp->xvec[skb->sp->len++] = xs;
+	skb->sp->olen++;
+	xo = xfrm_offload(skb);
+	xo->flags = CRYPTO_DONE;
+	xo->status = CRYPTO_SUCCESS;
+}
+
+/**
  * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
  * @adapter: board private structure
  **/
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 6eabf92..60f9f2d 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1755,6 +1755,9 @@ static void ixgbe_process_skb_fields(struct ixgbe_ring *rx_ring,
 
 	skb_record_rx_queue(skb, rx_ring->queue_index);
 
+	if (ixgbe_test_staterr(rx_desc, IXGBE_RXDADV_STAT_SECP))
+		ixgbe_ipsec_rx(rx_ring, rx_desc, skb);
+
 	skb->protocol = eth_type_trans(skb, dev);
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 07/10] ixgbe: process the Rx ipsec offload
@ 2017-12-05  5:35   ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan

If the chip sees and decrypts an ipsec offload, set up the skb
sp pointer with the ralated SA info.  Since the chip is rude
enough to keep to itself the table index it used for the
decryption, we have to do our own table lookup, using the
hash for speed.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  6 ++
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 89 ++++++++++++++++++++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  3 +
 3 files changed, 98 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 7e8bca7..77f07dc 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -1009,9 +1009,15 @@ s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
 		       u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
 #ifdef CONFIG_XFRM_OFFLOAD
 void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
+void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
+		    union ixgbe_adv_rx_desc *rx_desc,
+		    struct sk_buff *skb);
 void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
 #else
 static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
+static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
+				  union ixgbe_adv_rx_desc *rx_desc,
+				  struct sk_buff *skb) { };
 static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
 #endif /* CONFIG_XFRM_OFFLOAD */
 #endif /* _IXGBE_H_ */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index b93ee7f..fd06d9b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -379,6 +379,35 @@ static int ixgbe_ipsec_find_empty_idx(struct ixgbe_ipsec *ipsec, bool rxtable)
 }
 
 /**
+ * ixgbe_ipsec_find_rx_state - find the state that matches
+ * @ipsec: pointer to ipsec struct
+ * @daddr: inbound address to match
+ * @proto: protocol to match
+ * @spi: SPI to match
+ *
+ * Returns a pointer to the matching SA state information
+ **/
+static struct xfrm_state *ixgbe_ipsec_find_rx_state(struct ixgbe_ipsec *ipsec,
+						    __be32 daddr, u8 proto,
+						    __be32 spi)
+{
+	struct rx_sa *rsa;
+	struct xfrm_state *ret = NULL;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(ipsec->rx_sa_list, rsa, hlist, spi)
+		if (spi == rsa->xs->id.spi &&
+		    daddr == rsa->xs->id.daddr.a4 &&
+		    proto == rsa->xs->id.proto) {
+			ret = rsa->xs;
+			xfrm_state_hold(ret);
+			break;
+		}
+	rcu_read_unlock();
+	return ret;
+}
+
+/**
  * ixgbe_ipsec_parse_proto_keys - find the key and salt based on the protocol
  * @xs: pointer to xfrm_state struct
  * @mykey: pointer to key array to populate
@@ -680,6 +709,66 @@ static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
 };
 
 /**
+ * ixgbe_ipsec_rx - decode ipsec bits from Rx descriptor
+ * @rx_ring: receiving ring
+ * @rx_desc: receive data descriptor
+ * @skb: current data packet
+ *
+ * Determine if there was an ipsec encapsulation noticed, and if so set up
+ * the resulting status for later in the receive stack.
+ **/
+void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
+		    union ixgbe_adv_rx_desc *rx_desc,
+		    struct sk_buff *skb)
+{
+	struct ixgbe_adapter *adapter = netdev_priv(rx_ring->netdev);
+	u16 pkt_info = le16_to_cpu(rx_desc->wb.lower.lo_dword.hs_rss.pkt_info);
+	u16 ipsec_pkt_types = IXGBE_RXDADV_PKTTYPE_IPSEC_AH |
+				IXGBE_RXDADV_PKTTYPE_IPSEC_ESP;
+	struct ixgbe_ipsec *ipsec = adapter->ipsec;
+	struct xfrm_offload *xo = NULL;
+	struct xfrm_state *xs = NULL;
+	struct iphdr *iph;
+	u8 *c_hdr;
+	__be32 spi;
+	u8 proto;
+
+	/* we can assume no vlan header in the way, b/c the
+	 * hw won't recognize the IPsec packet and anyway the
+	 * currently vlan device doesn't support xfrm offload.
+	 */
+	/* TODO: not supporting IPv6 yet */
+	iph = (struct iphdr *)(skb->data + ETH_HLEN);
+	c_hdr = (u8 *)iph + iph->ihl * 4;
+	switch (pkt_info & ipsec_pkt_types) {
+	case IXGBE_RXDADV_PKTTYPE_IPSEC_AH:
+		spi = ((struct ip_auth_hdr *)c_hdr)->spi;
+		proto = IPPROTO_AH;
+		break;
+	case IXGBE_RXDADV_PKTTYPE_IPSEC_ESP:
+		spi = ((struct ip_esp_hdr *)c_hdr)->spi;
+		proto = IPPROTO_ESP;
+		break;
+	default:
+		return;
+	}
+
+	xs = ixgbe_ipsec_find_rx_state(ipsec, iph->daddr, proto, spi);
+	if (unlikely(!xs))
+		return;
+
+	skb->sp = secpath_dup(skb->sp);
+	if (unlikely(!skb->sp))
+		return;
+
+	skb->sp->xvec[skb->sp->len++] = xs;
+	skb->sp->olen++;
+	xo = xfrm_offload(skb);
+	xo->flags = CRYPTO_DONE;
+	xo->status = CRYPTO_SUCCESS;
+}
+
+/**
  * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
  * @adapter: board private structure
  **/
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 6eabf92..60f9f2d 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1755,6 +1755,9 @@ static void ixgbe_process_skb_fields(struct ixgbe_ring *rx_ring,
 
 	skb_record_rx_queue(skb, rx_ring->queue_index);
 
+	if (ixgbe_test_staterr(rx_desc, IXGBE_RXDADV_STAT_SECP))
+		ixgbe_ipsec_rx(rx_ring, rx_desc, skb);
+
 	skb->protocol = eth_type_trans(skb, dev);
 }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [next-queue 08/10] ixgbe: process the Tx ipsec offload
  2017-12-05  5:35 ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05  5:35   ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher
  Cc: steffen.klassert, sowmini.varadhan, netdev

If the skb has a security association referenced in the skb, then
set up the Tx descriptor with the ipsec offload bits.  While we're
here, we fix an oddly named field in the context descriptor struct.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h       | 10 +++-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 77 ++++++++++++++++++++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c   |  4 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 38 ++++++++++---
 drivers/net/ethernet/intel/ixgbe/ixgbe_type.h  |  2 +-
 5 files changed, 118 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 77f07dc..68097fe 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -171,10 +171,11 @@ enum ixgbe_tx_flags {
 	IXGBE_TX_FLAGS_CC	= 0x08,
 	IXGBE_TX_FLAGS_IPV4	= 0x10,
 	IXGBE_TX_FLAGS_CSUM	= 0x20,
+	IXGBE_TX_FLAGS_IPSEC	= 0x40,
 
 	/* software defined flags */
-	IXGBE_TX_FLAGS_SW_VLAN	= 0x40,
-	IXGBE_TX_FLAGS_FCOE	= 0x80,
+	IXGBE_TX_FLAGS_SW_VLAN	= 0x80,
+	IXGBE_TX_FLAGS_FCOE	= 0x100,
 };
 
 /* VLAN info */
@@ -1012,12 +1013,17 @@ void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
 void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
 		    union ixgbe_adv_rx_desc *rx_desc,
 		    struct sk_buff *skb);
+int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
+		   __be16 protocol, struct ixgbe_ipsec_tx_data *itd);
 void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
 #else
 static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
 static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
 				  union ixgbe_adv_rx_desc *rx_desc,
 				  struct sk_buff *skb) { };
+static inline int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring,
+				 struct sk_buff *skb, __be16 protocol,
+				 struct ixgbe_ipsec_tx_data *itd) { return 0; };
 static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
 #endif /* CONFIG_XFRM_OFFLOAD */
 #endif /* _IXGBE_H_ */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index fd06d9b..2a0dd7a 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -703,12 +703,89 @@ static void ixgbe_ipsec_del_sa(struct xfrm_state *xs)
 	}
 }
 
+/**
+ * ixgbe_ipsec_offload_ok - can this packet use the xfrm hw offload
+ * @skb: current data packet
+ * @xs: pointer to transformer state struct
+ **/
+static bool ixgbe_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
+{
+	if (xs->props.family == AF_INET) {
+		/* Offload with IPv4 options is not supported yet */
+		if (ip_hdr(skb)->ihl > 5)
+			return false;
+	} else {
+		/* Offload with IPv6 extension headers is not support yet */
+		if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
+			return false;
+	}
+
+	return true;
+}
+
 static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
 	.xdo_dev_state_add = ixgbe_ipsec_add_sa,
 	.xdo_dev_state_delete = ixgbe_ipsec_del_sa,
+	.xdo_dev_offload_ok = ixgbe_ipsec_offload_ok,
 };
 
 /**
+ * ixgbe_ipsec_tx - setup Tx flags for ipsec offload
+ * @tx_ring: outgoing context
+ * @skb: current data packet
+ * @protocol: network protocol
+ * @itd: ipsec Tx data for later use in building context descriptor
+ **/
+int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
+		   __be16 protocol, struct ixgbe_ipsec_tx_data *itd)
+{
+	struct ixgbe_adapter *adapter = netdev_priv(tx_ring->netdev);
+	struct ixgbe_ipsec *ipsec = adapter->ipsec;
+	struct xfrm_state *xs;
+	struct tx_sa *tsa;
+
+	if (!skb->sp->len) {
+		netdev_err(tx_ring->netdev, "%s: no xfrm state len = %d\n",
+			   __func__, skb->sp->len);
+		return 0;
+	}
+
+	xs = xfrm_input_state(skb);
+	if (!xs) {
+		netdev_err(tx_ring->netdev, "%s: no xfrm_input_state() xs = %p\n",
+			   __func__, xs);
+		return 0;
+	}
+
+	itd->sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX;
+	if (itd->sa_idx > IXGBE_IPSEC_MAX_SA_COUNT) {
+		netdev_err(tx_ring->netdev, "%s: bad sa_idx=%d handle=%lu\n",
+			   __func__, itd->sa_idx, xs->xso.offload_handle);
+		return 0;
+	}
+
+	tsa = &ipsec->tx_tbl[itd->sa_idx];
+	if (!tsa->used) {
+		netdev_err(tx_ring->netdev, "%s: unused sa_idx=%d\n",
+			   __func__, itd->sa_idx);
+		return 0;
+	}
+
+	itd->flags = 0;
+	if (xs->id.proto == IPPROTO_ESP) {
+		itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_TYPE_ESP |
+			      IXGBE_ADVTXD_TUCMD_L4T_TCP;
+		if (protocol == htons(ETH_P_IP))
+			itd->flags |= IXGBE_ADVTXD_TUCMD_IPV4;
+		itd->trailer_len = xs->props.trailer_len;
+	}
+	if (tsa->encrypt)
+		itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN;
+
+	return 1;
+}
+
+/**
  * ixgbe_ipsec_rx - decode ipsec bits from Rx descriptor
  * @rx_ring: receiving ring
  * @rx_desc: receive data descriptor
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
index f1bfae0..d7875b3 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
@@ -1261,7 +1261,7 @@ void ixgbe_clear_interrupt_scheme(struct ixgbe_adapter *adapter)
 }
 
 void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring, u32 vlan_macip_lens,
-		       u32 fcoe_sof_eof, u32 type_tucmd, u32 mss_l4len_idx)
+		       u32 fceof_saidx, u32 type_tucmd, u32 mss_l4len_idx)
 {
 	struct ixgbe_adv_tx_context_desc *context_desc;
 	u16 i = tx_ring->next_to_use;
@@ -1275,7 +1275,7 @@ void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring, u32 vlan_macip_lens,
 	type_tucmd |= IXGBE_TXD_CMD_DEXT | IXGBE_ADVTXD_DTYP_CTXT;
 
 	context_desc->vlan_macip_lens	= cpu_to_le32(vlan_macip_lens);
-	context_desc->seqnum_seed	= cpu_to_le32(fcoe_sof_eof);
+	context_desc->fceof_saidx	= cpu_to_le32(fceof_saidx);
 	context_desc->type_tucmd_mlhl	= cpu_to_le32(type_tucmd);
 	context_desc->mss_l4len_idx	= cpu_to_le32(mss_l4len_idx);
 }
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 60f9f2d..c857594 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -7659,9 +7659,10 @@ static void ixgbe_service_task(struct work_struct *work)
 
 static int ixgbe_tso(struct ixgbe_ring *tx_ring,
 		     struct ixgbe_tx_buffer *first,
-		     u8 *hdr_len)
+		     u8 *hdr_len,
+		     struct ixgbe_ipsec_tx_data *itd)
 {
-	u32 vlan_macip_lens, type_tucmd, mss_l4len_idx;
+	u32 vlan_macip_lens, type_tucmd, mss_l4len_idx, fceof_saidx = 0;
 	struct sk_buff *skb = first->skb;
 	union {
 		struct iphdr *v4;
@@ -7740,7 +7741,12 @@ static int ixgbe_tso(struct ixgbe_ring *tx_ring,
 	vlan_macip_lens |= (ip.hdr - skb->data) << IXGBE_ADVTXD_MACLEN_SHIFT;
 	vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
 
-	ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd,
+	if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
+		fceof_saidx |= itd->sa_idx;
+		type_tucmd |= itd->flags | itd->trailer_len;
+	}
+
+	ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx, type_tucmd,
 			  mss_l4len_idx);
 
 	return 1;
@@ -7756,10 +7762,12 @@ static inline bool ixgbe_ipv6_csum_is_sctp(struct sk_buff *skb)
 }
 
 static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
-			  struct ixgbe_tx_buffer *first)
+			  struct ixgbe_tx_buffer *first,
+			  struct ixgbe_ipsec_tx_data *itd)
 {
 	struct sk_buff *skb = first->skb;
 	u32 vlan_macip_lens = 0;
+	u32 fceof_saidx = 0;
 	u32 type_tucmd = 0;
 
 	if (skb->ip_summed != CHECKSUM_PARTIAL) {
@@ -7800,7 +7808,12 @@ static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
 	vlan_macip_lens |= skb_network_offset(skb) << IXGBE_ADVTXD_MACLEN_SHIFT;
 	vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
 
-	ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd, 0);
+	if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
+		fceof_saidx |= itd->sa_idx;
+		type_tucmd |= itd->flags | itd->trailer_len;
+	}
+
+	ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx, type_tucmd, 0);
 }
 
 #define IXGBE_SET_FLAG(_input, _flag, _result) \
@@ -7843,11 +7856,16 @@ static void ixgbe_tx_olinfo_status(union ixgbe_adv_tx_desc *tx_desc,
 					IXGBE_TX_FLAGS_CSUM,
 					IXGBE_ADVTXD_POPTS_TXSM);
 
-	/* enble IPv4 checksum for TSO */
+	/* enable IPv4 checksum for TSO */
 	olinfo_status |= IXGBE_SET_FLAG(tx_flags,
 					IXGBE_TX_FLAGS_IPV4,
 					IXGBE_ADVTXD_POPTS_IXSM);
 
+	/* enable IPsec */
+	olinfo_status |= IXGBE_SET_FLAG(tx_flags,
+					IXGBE_TX_FLAGS_IPSEC,
+					IXGBE_ADVTXD_POPTS_IPSEC);
+
 	/*
 	 * Check Context must be set if Tx switch is enabled, which it
 	 * always is for case where virtual functions are running
@@ -8306,6 +8324,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
 	u32 tx_flags = 0;
 	unsigned short f;
 	u16 count = TXD_USE_COUNT(skb_headlen(skb));
+	struct ixgbe_ipsec_tx_data ipsec_tx = { 0 };
 	__be16 protocol = skb->protocol;
 	u8 hdr_len = 0;
 
@@ -8394,6 +8413,9 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
 		}
 	}
 
+	if (skb->sp && ixgbe_ipsec_tx(tx_ring, skb, protocol, &ipsec_tx))
+		tx_flags |= IXGBE_TX_FLAGS_IPSEC | IXGBE_TX_FLAGS_CC;
+
 	/* record initial flags and protocol */
 	first->tx_flags = tx_flags;
 	first->protocol = protocol;
@@ -8410,11 +8432,11 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
 	}
 
 #endif /* IXGBE_FCOE */
-	tso = ixgbe_tso(tx_ring, first, &hdr_len);
+	tso = ixgbe_tso(tx_ring, first, &hdr_len, &ipsec_tx);
 	if (tso < 0)
 		goto out_drop;
 	else if (!tso)
-		ixgbe_tx_csum(tx_ring, first);
+		ixgbe_tx_csum(tx_ring, first, &ipsec_tx);
 
 	/* add the ATR filter if ATR is on */
 	if (test_bit(__IXGBE_TX_FDIR_INIT_DONE, &tx_ring->state))
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
index 3df0763..0ac725fa 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
@@ -2856,7 +2856,7 @@ union ixgbe_adv_rx_desc {
 /* Context descriptors */
 struct ixgbe_adv_tx_context_desc {
 	__le32 vlan_macip_lens;
-	__le32 seqnum_seed;
+	__le32 fceof_saidx;
 	__le32 type_tucmd_mlhl;
 	__le32 mss_l4len_idx;
 };
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 08/10] ixgbe: process the Tx ipsec offload
@ 2017-12-05  5:35   ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan

If the skb has a security association referenced in the skb, then
set up the Tx descriptor with the ipsec offload bits.  While we're
here, we fix an oddly named field in the context descriptor struct.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h       | 10 +++-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 77 ++++++++++++++++++++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c   |  4 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 38 ++++++++++---
 drivers/net/ethernet/intel/ixgbe/ixgbe_type.h  |  2 +-
 5 files changed, 118 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 77f07dc..68097fe 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -171,10 +171,11 @@ enum ixgbe_tx_flags {
 	IXGBE_TX_FLAGS_CC	= 0x08,
 	IXGBE_TX_FLAGS_IPV4	= 0x10,
 	IXGBE_TX_FLAGS_CSUM	= 0x20,
+	IXGBE_TX_FLAGS_IPSEC	= 0x40,
 
 	/* software defined flags */
-	IXGBE_TX_FLAGS_SW_VLAN	= 0x40,
-	IXGBE_TX_FLAGS_FCOE	= 0x80,
+	IXGBE_TX_FLAGS_SW_VLAN	= 0x80,
+	IXGBE_TX_FLAGS_FCOE	= 0x100,
 };
 
 /* VLAN info */
@@ -1012,12 +1013,17 @@ void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
 void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
 		    union ixgbe_adv_rx_desc *rx_desc,
 		    struct sk_buff *skb);
+int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
+		   __be16 protocol, struct ixgbe_ipsec_tx_data *itd);
 void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
 #else
 static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
 static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
 				  union ixgbe_adv_rx_desc *rx_desc,
 				  struct sk_buff *skb) { };
+static inline int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring,
+				 struct sk_buff *skb, __be16 protocol,
+				 struct ixgbe_ipsec_tx_data *itd) { return 0; };
 static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
 #endif /* CONFIG_XFRM_OFFLOAD */
 #endif /* _IXGBE_H_ */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index fd06d9b..2a0dd7a 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -703,12 +703,89 @@ static void ixgbe_ipsec_del_sa(struct xfrm_state *xs)
 	}
 }
 
+/**
+ * ixgbe_ipsec_offload_ok - can this packet use the xfrm hw offload
+ * @skb: current data packet
+ * @xs: pointer to transformer state struct
+ **/
+static bool ixgbe_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
+{
+	if (xs->props.family == AF_INET) {
+		/* Offload with IPv4 options is not supported yet */
+		if (ip_hdr(skb)->ihl > 5)
+			return false;
+	} else {
+		/* Offload with IPv6 extension headers is not support yet */
+		if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
+			return false;
+	}
+
+	return true;
+}
+
 static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
 	.xdo_dev_state_add = ixgbe_ipsec_add_sa,
 	.xdo_dev_state_delete = ixgbe_ipsec_del_sa,
+	.xdo_dev_offload_ok = ixgbe_ipsec_offload_ok,
 };
 
 /**
+ * ixgbe_ipsec_tx - setup Tx flags for ipsec offload
+ * @tx_ring: outgoing context
+ * @skb: current data packet
+ * @protocol: network protocol
+ * @itd: ipsec Tx data for later use in building context descriptor
+ **/
+int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
+		   __be16 protocol, struct ixgbe_ipsec_tx_data *itd)
+{
+	struct ixgbe_adapter *adapter = netdev_priv(tx_ring->netdev);
+	struct ixgbe_ipsec *ipsec = adapter->ipsec;
+	struct xfrm_state *xs;
+	struct tx_sa *tsa;
+
+	if (!skb->sp->len) {
+		netdev_err(tx_ring->netdev, "%s: no xfrm state len = %d\n",
+			   __func__, skb->sp->len);
+		return 0;
+	}
+
+	xs = xfrm_input_state(skb);
+	if (!xs) {
+		netdev_err(tx_ring->netdev, "%s: no xfrm_input_state() xs = %p\n",
+			   __func__, xs);
+		return 0;
+	}
+
+	itd->sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX;
+	if (itd->sa_idx > IXGBE_IPSEC_MAX_SA_COUNT) {
+		netdev_err(tx_ring->netdev, "%s: bad sa_idx=%d handle=%lu\n",
+			   __func__, itd->sa_idx, xs->xso.offload_handle);
+		return 0;
+	}
+
+	tsa = &ipsec->tx_tbl[itd->sa_idx];
+	if (!tsa->used) {
+		netdev_err(tx_ring->netdev, "%s: unused sa_idx=%d\n",
+			   __func__, itd->sa_idx);
+		return 0;
+	}
+
+	itd->flags = 0;
+	if (xs->id.proto == IPPROTO_ESP) {
+		itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_TYPE_ESP |
+			      IXGBE_ADVTXD_TUCMD_L4T_TCP;
+		if (protocol == htons(ETH_P_IP))
+			itd->flags |= IXGBE_ADVTXD_TUCMD_IPV4;
+		itd->trailer_len = xs->props.trailer_len;
+	}
+	if (tsa->encrypt)
+		itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN;
+
+	return 1;
+}
+
+/**
  * ixgbe_ipsec_rx - decode ipsec bits from Rx descriptor
  * @rx_ring: receiving ring
  * @rx_desc: receive data descriptor
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
index f1bfae0..d7875b3 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
@@ -1261,7 +1261,7 @@ void ixgbe_clear_interrupt_scheme(struct ixgbe_adapter *adapter)
 }
 
 void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring, u32 vlan_macip_lens,
-		       u32 fcoe_sof_eof, u32 type_tucmd, u32 mss_l4len_idx)
+		       u32 fceof_saidx, u32 type_tucmd, u32 mss_l4len_idx)
 {
 	struct ixgbe_adv_tx_context_desc *context_desc;
 	u16 i = tx_ring->next_to_use;
@@ -1275,7 +1275,7 @@ void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring, u32 vlan_macip_lens,
 	type_tucmd |= IXGBE_TXD_CMD_DEXT | IXGBE_ADVTXD_DTYP_CTXT;
 
 	context_desc->vlan_macip_lens	= cpu_to_le32(vlan_macip_lens);
-	context_desc->seqnum_seed	= cpu_to_le32(fcoe_sof_eof);
+	context_desc->fceof_saidx	= cpu_to_le32(fceof_saidx);
 	context_desc->type_tucmd_mlhl	= cpu_to_le32(type_tucmd);
 	context_desc->mss_l4len_idx	= cpu_to_le32(mss_l4len_idx);
 }
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 60f9f2d..c857594 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -7659,9 +7659,10 @@ static void ixgbe_service_task(struct work_struct *work)
 
 static int ixgbe_tso(struct ixgbe_ring *tx_ring,
 		     struct ixgbe_tx_buffer *first,
-		     u8 *hdr_len)
+		     u8 *hdr_len,
+		     struct ixgbe_ipsec_tx_data *itd)
 {
-	u32 vlan_macip_lens, type_tucmd, mss_l4len_idx;
+	u32 vlan_macip_lens, type_tucmd, mss_l4len_idx, fceof_saidx = 0;
 	struct sk_buff *skb = first->skb;
 	union {
 		struct iphdr *v4;
@@ -7740,7 +7741,12 @@ static int ixgbe_tso(struct ixgbe_ring *tx_ring,
 	vlan_macip_lens |= (ip.hdr - skb->data) << IXGBE_ADVTXD_MACLEN_SHIFT;
 	vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
 
-	ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd,
+	if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
+		fceof_saidx |= itd->sa_idx;
+		type_tucmd |= itd->flags | itd->trailer_len;
+	}
+
+	ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx, type_tucmd,
 			  mss_l4len_idx);
 
 	return 1;
@@ -7756,10 +7762,12 @@ static inline bool ixgbe_ipv6_csum_is_sctp(struct sk_buff *skb)
 }
 
 static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
-			  struct ixgbe_tx_buffer *first)
+			  struct ixgbe_tx_buffer *first,
+			  struct ixgbe_ipsec_tx_data *itd)
 {
 	struct sk_buff *skb = first->skb;
 	u32 vlan_macip_lens = 0;
+	u32 fceof_saidx = 0;
 	u32 type_tucmd = 0;
 
 	if (skb->ip_summed != CHECKSUM_PARTIAL) {
@@ -7800,7 +7808,12 @@ static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
 	vlan_macip_lens |= skb_network_offset(skb) << IXGBE_ADVTXD_MACLEN_SHIFT;
 	vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
 
-	ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd, 0);
+	if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
+		fceof_saidx |= itd->sa_idx;
+		type_tucmd |= itd->flags | itd->trailer_len;
+	}
+
+	ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx, type_tucmd, 0);
 }
 
 #define IXGBE_SET_FLAG(_input, _flag, _result) \
@@ -7843,11 +7856,16 @@ static void ixgbe_tx_olinfo_status(union ixgbe_adv_tx_desc *tx_desc,
 					IXGBE_TX_FLAGS_CSUM,
 					IXGBE_ADVTXD_POPTS_TXSM);
 
-	/* enble IPv4 checksum for TSO */
+	/* enable IPv4 checksum for TSO */
 	olinfo_status |= IXGBE_SET_FLAG(tx_flags,
 					IXGBE_TX_FLAGS_IPV4,
 					IXGBE_ADVTXD_POPTS_IXSM);
 
+	/* enable IPsec */
+	olinfo_status |= IXGBE_SET_FLAG(tx_flags,
+					IXGBE_TX_FLAGS_IPSEC,
+					IXGBE_ADVTXD_POPTS_IPSEC);
+
 	/*
 	 * Check Context must be set if Tx switch is enabled, which it
 	 * always is for case where virtual functions are running
@@ -8306,6 +8324,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
 	u32 tx_flags = 0;
 	unsigned short f;
 	u16 count = TXD_USE_COUNT(skb_headlen(skb));
+	struct ixgbe_ipsec_tx_data ipsec_tx = { 0 };
 	__be16 protocol = skb->protocol;
 	u8 hdr_len = 0;
 
@@ -8394,6 +8413,9 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
 		}
 	}
 
+	if (skb->sp && ixgbe_ipsec_tx(tx_ring, skb, protocol, &ipsec_tx))
+		tx_flags |= IXGBE_TX_FLAGS_IPSEC | IXGBE_TX_FLAGS_CC;
+
 	/* record initial flags and protocol */
 	first->tx_flags = tx_flags;
 	first->protocol = protocol;
@@ -8410,11 +8432,11 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
 	}
 
 #endif /* IXGBE_FCOE */
-	tso = ixgbe_tso(tx_ring, first, &hdr_len);
+	tso = ixgbe_tso(tx_ring, first, &hdr_len, &ipsec_tx);
 	if (tso < 0)
 		goto out_drop;
 	else if (!tso)
-		ixgbe_tx_csum(tx_ring, first);
+		ixgbe_tx_csum(tx_ring, first, &ipsec_tx);
 
 	/* add the ATR filter if ATR is on */
 	if (test_bit(__IXGBE_TX_FDIR_INIT_DONE, &tx_ring->state))
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
index 3df0763..0ac725fa 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
@@ -2856,7 +2856,7 @@ union ixgbe_adv_rx_desc {
 /* Context descriptors */
 struct ixgbe_adv_tx_context_desc {
 	__le32 vlan_macip_lens;
-	__le32 seqnum_seed;
+	__le32 fceof_saidx;
 	__le32 type_tucmd_mlhl;
 	__le32 mss_l4len_idx;
 };
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [next-queue 09/10] ixgbe: ipsec offload stats
  2017-12-05  5:35 ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05  5:35   ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher
  Cc: steffen.klassert, sowmini.varadhan, netdev

Add a simple statistic to count the ipsec offloads.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h         |  1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 28 ++++++++++++++----------
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c   |  3 +++
 3 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 68097fe..bb66c85 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -265,6 +265,7 @@ struct ixgbe_rx_buffer {
 struct ixgbe_queue_stats {
 	u64 packets;
 	u64 bytes;
+	u64 ipsec_offloads;
 };
 
 struct ixgbe_tx_queue_stats {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index c3e7a81..dddbc74 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -1233,34 +1233,34 @@ static void ixgbe_get_ethtool_stats(struct net_device *netdev,
 	for (j = 0; j < netdev->num_tx_queues; j++) {
 		ring = adapter->tx_ring[j];
 		if (!ring) {
-			data[i] = 0;
-			data[i+1] = 0;
-			i += 2;
+			data[i++] = 0;
+			data[i++] = 0;
+			data[i++] = 0;
 			continue;
 		}
 
 		do {
 			start = u64_stats_fetch_begin_irq(&ring->syncp);
-			data[i]   = ring->stats.packets;
-			data[i+1] = ring->stats.bytes;
+			data[i++] = ring->stats.packets;
+			data[i++] = ring->stats.bytes;
+			data[i++] = ring->stats.ipsec_offloads;
 		} while (u64_stats_fetch_retry_irq(&ring->syncp, start));
-		i += 2;
 	}
 	for (j = 0; j < IXGBE_NUM_RX_QUEUES; j++) {
 		ring = adapter->rx_ring[j];
 		if (!ring) {
-			data[i] = 0;
-			data[i+1] = 0;
-			i += 2;
+			data[i++] = 0;
+			data[i++] = 0;
+			data[i++] = 0;
 			continue;
 		}
 
 		do {
 			start = u64_stats_fetch_begin_irq(&ring->syncp);
-			data[i]   = ring->stats.packets;
-			data[i+1] = ring->stats.bytes;
+			data[i++] = ring->stats.packets;
+			data[i++] = ring->stats.bytes;
+			data[i++] = ring->stats.ipsec_offloads;
 		} while (u64_stats_fetch_retry_irq(&ring->syncp, start));
-		i += 2;
 	}
 
 	for (j = 0; j < IXGBE_MAX_PACKET_BUFFERS; j++) {
@@ -1297,12 +1297,16 @@ static void ixgbe_get_strings(struct net_device *netdev, u32 stringset,
 			p += ETH_GSTRING_LEN;
 			sprintf(p, "tx_queue_%u_bytes", i);
 			p += ETH_GSTRING_LEN;
+			sprintf(p, "tx_queue_%u_ipsec_offloads", i);
+			p += ETH_GSTRING_LEN;
 		}
 		for (i = 0; i < IXGBE_NUM_RX_QUEUES; i++) {
 			sprintf(p, "rx_queue_%u_packets", i);
 			p += ETH_GSTRING_LEN;
 			sprintf(p, "rx_queue_%u_bytes", i);
 			p += ETH_GSTRING_LEN;
+			sprintf(p, "rx_queue_%u_ipsec_offloads", i);
+			p += ETH_GSTRING_LEN;
 		}
 		for (i = 0; i < IXGBE_MAX_PACKET_BUFFERS; i++) {
 			sprintf(p, "tx_pb_%u_pxon", i);
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 2a0dd7a..d1220bf 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -782,6 +782,7 @@ int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
 	if (tsa->encrypt)
 		itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN;
 
+	tx_ring->stats.ipsec_offloads++;
 	return 1;
 }
 
@@ -843,6 +844,8 @@ void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
 	xo = xfrm_offload(skb);
 	xo->flags = CRYPTO_DONE;
 	xo->status = CRYPTO_SUCCESS;
+
+	rx_ring->stats.ipsec_offloads++;
 }
 
 /**
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 09/10] ixgbe: ipsec offload stats
@ 2017-12-05  5:35   ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan

Add a simple statistic to count the ipsec offloads.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h         |  1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 28 ++++++++++++++----------
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c   |  3 +++
 3 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 68097fe..bb66c85 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -265,6 +265,7 @@ struct ixgbe_rx_buffer {
 struct ixgbe_queue_stats {
 	u64 packets;
 	u64 bytes;
+	u64 ipsec_offloads;
 };
 
 struct ixgbe_tx_queue_stats {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index c3e7a81..dddbc74 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -1233,34 +1233,34 @@ static void ixgbe_get_ethtool_stats(struct net_device *netdev,
 	for (j = 0; j < netdev->num_tx_queues; j++) {
 		ring = adapter->tx_ring[j];
 		if (!ring) {
-			data[i] = 0;
-			data[i+1] = 0;
-			i += 2;
+			data[i++] = 0;
+			data[i++] = 0;
+			data[i++] = 0;
 			continue;
 		}
 
 		do {
 			start = u64_stats_fetch_begin_irq(&ring->syncp);
-			data[i]   = ring->stats.packets;
-			data[i+1] = ring->stats.bytes;
+			data[i++] = ring->stats.packets;
+			data[i++] = ring->stats.bytes;
+			data[i++] = ring->stats.ipsec_offloads;
 		} while (u64_stats_fetch_retry_irq(&ring->syncp, start));
-		i += 2;
 	}
 	for (j = 0; j < IXGBE_NUM_RX_QUEUES; j++) {
 		ring = adapter->rx_ring[j];
 		if (!ring) {
-			data[i] = 0;
-			data[i+1] = 0;
-			i += 2;
+			data[i++] = 0;
+			data[i++] = 0;
+			data[i++] = 0;
 			continue;
 		}
 
 		do {
 			start = u64_stats_fetch_begin_irq(&ring->syncp);
-			data[i]   = ring->stats.packets;
-			data[i+1] = ring->stats.bytes;
+			data[i++] = ring->stats.packets;
+			data[i++] = ring->stats.bytes;
+			data[i++] = ring->stats.ipsec_offloads;
 		} while (u64_stats_fetch_retry_irq(&ring->syncp, start));
-		i += 2;
 	}
 
 	for (j = 0; j < IXGBE_MAX_PACKET_BUFFERS; j++) {
@@ -1297,12 +1297,16 @@ static void ixgbe_get_strings(struct net_device *netdev, u32 stringset,
 			p += ETH_GSTRING_LEN;
 			sprintf(p, "tx_queue_%u_bytes", i);
 			p += ETH_GSTRING_LEN;
+			sprintf(p, "tx_queue_%u_ipsec_offloads", i);
+			p += ETH_GSTRING_LEN;
 		}
 		for (i = 0; i < IXGBE_NUM_RX_QUEUES; i++) {
 			sprintf(p, "rx_queue_%u_packets", i);
 			p += ETH_GSTRING_LEN;
 			sprintf(p, "rx_queue_%u_bytes", i);
 			p += ETH_GSTRING_LEN;
+			sprintf(p, "rx_queue_%u_ipsec_offloads", i);
+			p += ETH_GSTRING_LEN;
 		}
 		for (i = 0; i < IXGBE_MAX_PACKET_BUFFERS; i++) {
 			sprintf(p, "tx_pb_%u_pxon", i);
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 2a0dd7a..d1220bf 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -782,6 +782,7 @@ int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
 	if (tsa->encrypt)
 		itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN;
 
+	tx_ring->stats.ipsec_offloads++;
 	return 1;
 }
 
@@ -843,6 +844,8 @@ void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
 	xo = xfrm_offload(skb);
 	xo->flags = CRYPTO_DONE;
 	xo->status = CRYPTO_SUCCESS;
+
+	rx_ring->stats.ipsec_offloads++;
 }
 
 /**
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [next-queue 10/10] ixgbe: register ipsec offload with the xfrm subsystem
  2017-12-05  5:35 ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05  5:35   ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher
  Cc: steffen.klassert, sowmini.varadhan, netdev

With all the support code in place we can now link in the ipsec
offload operations and set the ESP feature flag for the XFRM
subsystem to see.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 4 ++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index d1220bf..0d5497b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -884,6 +884,10 @@ void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
 	ixgbe_ipsec_clear_hw_tables(adapter);
 	ixgbe_ipsec_stop_engine(adapter);
 
+	adapter->netdev->xfrmdev_ops = &ixgbe_xfrmdev_ops;
+	adapter->netdev->features |= NETIF_F_HW_ESP;
+	adapter->netdev->hw_enc_features |= NETIF_F_HW_ESP;
+
 	return;
 err:
 	if (ipsec) {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index c857594..9231351 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -9799,6 +9799,10 @@ ixgbe_features_check(struct sk_buff *skb, struct net_device *dev,
 	if (skb->encapsulation && !(features & NETIF_F_TSO_MANGLEID))
 		features &= ~NETIF_F_TSO;
 
+	/* IPsec offload doesn't get along well with others *yet* */
+	if (skb->sp)
+		features &= ~(NETIF_F_TSO | NETIF_F_HW_CSUM_BIT);
+
 	return features;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 10/10] ixgbe: register ipsec offload with the xfrm subsystem
@ 2017-12-05  5:35   ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-05  5:35 UTC (permalink / raw)
  To: intel-wired-lan

With all the support code in place we can now link in the ipsec
offload operations and set the ESP feature flag for the XFRM
subsystem to see.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 4 ++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index d1220bf..0d5497b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -884,6 +884,10 @@ void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
 	ixgbe_ipsec_clear_hw_tables(adapter);
 	ixgbe_ipsec_stop_engine(adapter);
 
+	adapter->netdev->xfrmdev_ops = &ixgbe_xfrmdev_ops;
+	adapter->netdev->features |= NETIF_F_HW_ESP;
+	adapter->netdev->hw_enc_features |= NETIF_F_HW_ESP;
+
 	return;
 err:
 	if (ipsec) {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index c857594..9231351 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -9799,6 +9799,10 @@ ixgbe_features_check(struct sk_buff *skb, struct net_device *dev,
 	if (skb->encapsulation && !(features & NETIF_F_TSO_MANGLEID))
 		features &= ~NETIF_F_TSO;
 
+	/* IPsec offload doesn't get along well with others *yet* */
+	if (skb->sp)
+		features &= ~(NETIF_F_TSO | NETIF_F_HW_CSUM_BIT);
+
 	return features;
 }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 03/10] ixgbe: add ipsec engine start and stop routines
  2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05 16:22     ` Alexander Duyck
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 16:22 UTC (permalink / raw)
  To: Shannon Nelson
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> Add in the code for running and stopping the hardware ipsec
> encryption/decryption engine.  It is good to keep the engine
> off when not in use in order to save on the power draw.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 140 +++++++++++++++++++++++++
>  1 file changed, 140 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> index 14dd011..38a1a16 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -148,10 +148,150 @@ void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
>  }
>
>  /**
> + * ixgbe_ipsec_stop_data
> + * @adapter: board private structure
> + **/
> +static void ixgbe_ipsec_stop_data(struct ixgbe_adapter *adapter)
> +{
> +       struct ixgbe_hw *hw = &adapter->hw;
> +       bool link = adapter->link_up;
> +       u32 t_rdy, r_rdy;
> +       u32 reg;
> +
> +       /* halt data paths */
> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXCTRL);
> +       reg |= IXGBE_SECTXCTRL_TX_DIS;
> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, reg);
> +
> +       reg = IXGBE_READ_REG(hw, IXGBE_SECRXCTRL);
> +       reg |= IXGBE_SECRXCTRL_RX_DIS;
> +       IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, reg);
> +
> +       IXGBE_WRITE_FLUSH(hw);
> +
> +       /* If the tx fifo doesn't have link, but still has data,
> +        * we can't clear the tx sec block.  Set the MAC loopback
> +        * before block clear
> +        */
> +       if (!link) {
> +               reg = IXGBE_READ_REG(hw, IXGBE_MACC);
> +               reg |= IXGBE_MACC_FLU;
> +               IXGBE_WRITE_REG(hw, IXGBE_MACC, reg);
> +
> +               reg = IXGBE_READ_REG(hw, IXGBE_HLREG0);
> +               reg |= IXGBE_HLREG0_LPBK;
> +               IXGBE_WRITE_REG(hw, IXGBE_HLREG0, reg);
> +
> +               IXGBE_WRITE_FLUSH(hw);
> +               mdelay(3);
> +       }
> +
> +       /* wait for the paths to empty */
> +       do {
> +               mdelay(10);
> +               t_rdy = IXGBE_READ_REG(hw, IXGBE_SECTXSTAT) &
> +                       IXGBE_SECTXSTAT_SECTX_RDY;
> +               r_rdy = IXGBE_READ_REG(hw, IXGBE_SECRXSTAT) &
> +                       IXGBE_SECRXSTAT_SECRX_RDY;
> +       } while (!t_rdy && !r_rdy);

This piece seems buggy to me. There should be some sort of limit on
how long you are willing to delay. Otherwise a surprise remove can
cause this to spin forever when the register reads return all 1's.

> +
> +       /* undo loopback if we played with it earlier */
> +       if (!link) {
> +               reg = IXGBE_READ_REG(hw, IXGBE_MACC);
> +               reg &= ~IXGBE_MACC_FLU;
> +               IXGBE_WRITE_REG(hw, IXGBE_MACC, reg);
> +
> +               reg = IXGBE_READ_REG(hw, IXGBE_HLREG0);
> +               reg &= ~IXGBE_HLREG0_LPBK;
> +               IXGBE_WRITE_REG(hw, IXGBE_HLREG0, reg);
> +
> +               IXGBE_WRITE_FLUSH(hw);
> +       }
> +}
> +
> +/**
> + * ixgbe_ipsec_stop_engine
> + * @adapter: board private structure
> + **/
> +static void ixgbe_ipsec_stop_engine(struct ixgbe_adapter *adapter)
> +{
> +       struct ixgbe_hw *hw = &adapter->hw;
> +       u32 reg;
> +
> +       ixgbe_ipsec_stop_data(adapter);
> +
> +       /* disable Rx and Tx SA lookup */
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, 0);
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, 0);
> +
> +       /* disable the Rx and Tx engines and full packet store-n-forward */
> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXCTRL);
> +       reg |= IXGBE_SECTXCTRL_SECTX_DIS;
> +       reg &= ~IXGBE_SECTXCTRL_STORE_FORWARD;
> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, reg);
> +
> +       reg = IXGBE_READ_REG(hw, IXGBE_SECRXCTRL);
> +       reg |= IXGBE_SECRXCTRL_SECRX_DIS;
> +       IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, reg);
> +
> +       /* restore the "tx security buffer almost full threshold" to 0x250 */
> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXBUFFAF, 0x250);
> +
> +       /* Set minimum IFG between packets back to the default 0x1 */
> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXMINIFG);
> +       reg = (reg & 0xfffffff0) | 0x1;
> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXMINIFG, reg);
> +
> +       /* final set for normal (no ipsec offload) processing */
> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, IXGBE_SECTXCTRL_SECTX_DIS);
> +       IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, IXGBE_SECRXCTRL_SECRX_DIS);
> +
> +       IXGBE_WRITE_FLUSH(hw);
> +}
> +
> +/**
> + * ixgbe_ipsec_start_engine
> + * @adapter: board private structure
> + *
> + * NOTE: this increases power consumption whether being used or not
> + **/
> +static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
> +{
> +       struct ixgbe_hw *hw = &adapter->hw;
> +       u32 reg;
> +
> +       ixgbe_ipsec_stop_data(adapter);
> +
> +       /* Set minimum IFG between packets to 3 */
> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXMINIFG);
> +       reg = (reg & 0xfffffff0) | 0x3;
> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXMINIFG, reg);
> +
> +       /* Set "tx security buffer almost full threshold" to 0x15 so that the
> +        * almost full indication is generated only after buffer contains at
> +        * least an entire jumbo packet.
> +        */
> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXBUFFAF);
> +       reg = (reg & 0xfffffc00) | 0x15;
> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXBUFFAF, reg);
> +
> +       /* restart the data paths by clearing the DISABLE bits */
> +       IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, 0);
> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, IXGBE_SECTXCTRL_STORE_FORWARD);
> +
> +       /* enable Rx and Tx SA lookup */
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, IXGBE_RXTXIDX_IPS_EN);
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, IXGBE_RXTXIDX_IPS_EN);
> +
> +       IXGBE_WRITE_FLUSH(hw);
> +}
> +

It would probably make sense to add a data member to the hardware
structure that tracks if you have IPsec enabled or not. Then you don't
have to track the IPS_EN bits in patch 2 like you currently are and
could instead either not do IPsec SA updates if IPsec is not enabled,
or use the enable value to determine what you write for IPS_EN instead
of having to read registers.

> +/**
>   * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
>   * @adapter: board private structure
>   **/
>  void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
>  {
>         ixgbe_ipsec_clear_hw_tables(adapter);
> +       ixgbe_ipsec_stop_engine(adapter);
>  }
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 03/10] ixgbe: add ipsec engine start and stop routines
@ 2017-12-05 16:22     ` Alexander Duyck
  0 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 16:22 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> Add in the code for running and stopping the hardware ipsec
> encryption/decryption engine.  It is good to keep the engine
> off when not in use in order to save on the power draw.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 140 +++++++++++++++++++++++++
>  1 file changed, 140 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> index 14dd011..38a1a16 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -148,10 +148,150 @@ void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
>  }
>
>  /**
> + * ixgbe_ipsec_stop_data
> + * @adapter: board private structure
> + **/
> +static void ixgbe_ipsec_stop_data(struct ixgbe_adapter *adapter)
> +{
> +       struct ixgbe_hw *hw = &adapter->hw;
> +       bool link = adapter->link_up;
> +       u32 t_rdy, r_rdy;
> +       u32 reg;
> +
> +       /* halt data paths */
> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXCTRL);
> +       reg |= IXGBE_SECTXCTRL_TX_DIS;
> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, reg);
> +
> +       reg = IXGBE_READ_REG(hw, IXGBE_SECRXCTRL);
> +       reg |= IXGBE_SECRXCTRL_RX_DIS;
> +       IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, reg);
> +
> +       IXGBE_WRITE_FLUSH(hw);
> +
> +       /* If the tx fifo doesn't have link, but still has data,
> +        * we can't clear the tx sec block.  Set the MAC loopback
> +        * before block clear
> +        */
> +       if (!link) {
> +               reg = IXGBE_READ_REG(hw, IXGBE_MACC);
> +               reg |= IXGBE_MACC_FLU;
> +               IXGBE_WRITE_REG(hw, IXGBE_MACC, reg);
> +
> +               reg = IXGBE_READ_REG(hw, IXGBE_HLREG0);
> +               reg |= IXGBE_HLREG0_LPBK;
> +               IXGBE_WRITE_REG(hw, IXGBE_HLREG0, reg);
> +
> +               IXGBE_WRITE_FLUSH(hw);
> +               mdelay(3);
> +       }
> +
> +       /* wait for the paths to empty */
> +       do {
> +               mdelay(10);
> +               t_rdy = IXGBE_READ_REG(hw, IXGBE_SECTXSTAT) &
> +                       IXGBE_SECTXSTAT_SECTX_RDY;
> +               r_rdy = IXGBE_READ_REG(hw, IXGBE_SECRXSTAT) &
> +                       IXGBE_SECRXSTAT_SECRX_RDY;
> +       } while (!t_rdy && !r_rdy);

This piece seems buggy to me. There should be some sort of limit on
how long you are willing to delay. Otherwise a surprise remove can
cause this to spin forever when the register reads return all 1's.

> +
> +       /* undo loopback if we played with it earlier */
> +       if (!link) {
> +               reg = IXGBE_READ_REG(hw, IXGBE_MACC);
> +               reg &= ~IXGBE_MACC_FLU;
> +               IXGBE_WRITE_REG(hw, IXGBE_MACC, reg);
> +
> +               reg = IXGBE_READ_REG(hw, IXGBE_HLREG0);
> +               reg &= ~IXGBE_HLREG0_LPBK;
> +               IXGBE_WRITE_REG(hw, IXGBE_HLREG0, reg);
> +
> +               IXGBE_WRITE_FLUSH(hw);
> +       }
> +}
> +
> +/**
> + * ixgbe_ipsec_stop_engine
> + * @adapter: board private structure
> + **/
> +static void ixgbe_ipsec_stop_engine(struct ixgbe_adapter *adapter)
> +{
> +       struct ixgbe_hw *hw = &adapter->hw;
> +       u32 reg;
> +
> +       ixgbe_ipsec_stop_data(adapter);
> +
> +       /* disable Rx and Tx SA lookup */
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, 0);
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, 0);
> +
> +       /* disable the Rx and Tx engines and full packet store-n-forward */
> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXCTRL);
> +       reg |= IXGBE_SECTXCTRL_SECTX_DIS;
> +       reg &= ~IXGBE_SECTXCTRL_STORE_FORWARD;
> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, reg);
> +
> +       reg = IXGBE_READ_REG(hw, IXGBE_SECRXCTRL);
> +       reg |= IXGBE_SECRXCTRL_SECRX_DIS;
> +       IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, reg);
> +
> +       /* restore the "tx security buffer almost full threshold" to 0x250 */
> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXBUFFAF, 0x250);
> +
> +       /* Set minimum IFG between packets back to the default 0x1 */
> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXMINIFG);
> +       reg = (reg & 0xfffffff0) | 0x1;
> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXMINIFG, reg);
> +
> +       /* final set for normal (no ipsec offload) processing */
> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, IXGBE_SECTXCTRL_SECTX_DIS);
> +       IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, IXGBE_SECRXCTRL_SECRX_DIS);
> +
> +       IXGBE_WRITE_FLUSH(hw);
> +}
> +
> +/**
> + * ixgbe_ipsec_start_engine
> + * @adapter: board private structure
> + *
> + * NOTE: this increases power consumption whether being used or not
> + **/
> +static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
> +{
> +       struct ixgbe_hw *hw = &adapter->hw;
> +       u32 reg;
> +
> +       ixgbe_ipsec_stop_data(adapter);
> +
> +       /* Set minimum IFG between packets to 3 */
> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXMINIFG);
> +       reg = (reg & 0xfffffff0) | 0x3;
> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXMINIFG, reg);
> +
> +       /* Set "tx security buffer almost full threshold" to 0x15 so that the
> +        * almost full indication is generated only after buffer contains at
> +        * least an entire jumbo packet.
> +        */
> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXBUFFAF);
> +       reg = (reg & 0xfffffc00) | 0x15;
> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXBUFFAF, reg);
> +
> +       /* restart the data paths by clearing the DISABLE bits */
> +       IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, 0);
> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, IXGBE_SECTXCTRL_STORE_FORWARD);
> +
> +       /* enable Rx and Tx SA lookup */
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, IXGBE_RXTXIDX_IPS_EN);
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, IXGBE_RXTXIDX_IPS_EN);
> +
> +       IXGBE_WRITE_FLUSH(hw);
> +}
> +

It would probably make sense to add a data member to the hardware
structure that tracks if you have IPsec enabled or not. Then you don't
have to track the IPS_EN bits in patch 2 like you currently are and
could instead either not do IPsec SA updates if IPsec is not enabled,
or use the enable value to determine what you write for IPS_EN instead
of having to read registers.

> +/**
>   * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
>   * @adapter: board private structure
>   **/
>  void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
>  {
>         ixgbe_ipsec_clear_hw_tables(adapter);
> +       ixgbe_ipsec_stop_engine(adapter);
>  }
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan at osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 02/10] ixgbe: add ipsec register access routines
  2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05 16:24     ` Rustad, Mark D
  -1 siblings, 0 replies; 78+ messages in thread
From: Rustad, Mark D @ 2017-12-05 16:24 UTC (permalink / raw)
  To: Shannon Nelson
  Cc: intel-wired-lan, Kirsher, Jeffrey T, steffen.klassert,
	sowmini.varadhan, netdev


> On Dec 4, 2017, at 9:35 PM, Shannon Nelson <shannon.nelson@oracle.com> wrote:
> 
> Add a few routines to make access to the ipsec registers just a little
> easier, and throw in the beginnings of an initialization.
> 
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
> drivers/net/ethernet/intel/ixgbe/Makefile      |   1 +
> drivers/net/ethernet/intel/ixgbe/ixgbe.h       |   6 +
> drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 157 +++++++++++++++++++++++++
> drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h |  50 ++++++++
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |   1 +

<snip>

> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> new file mode 100644
> index 0000000..14dd011
> --- /dev/null
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -0,0 +1,157 @@
> +/*******************************************************************************
> + *
> + * Intel 10 Gigabit PCI Express Linux driver
> + * Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.

I don't think that it really makes sense to assert "All rights reserved" in something that is GPL. It makes it seem like something is being asserted that is counter to the GPL.

<snip>

> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
> new file mode 100644
> index 0000000..017b13f
> --- /dev/null
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
> @@ -0,0 +1,50 @@
> +/*******************************************************************************
> +
> +  Intel 10 Gigabit PCI Express Linux driver
> +  Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.

Likewise here.

-- 
Mark Rustad, Networking Division, Intel Corporation

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 02/10] ixgbe: add ipsec register access routines
@ 2017-12-05 16:24     ` Rustad, Mark D
  0 siblings, 0 replies; 78+ messages in thread
From: Rustad, Mark D @ 2017-12-05 16:24 UTC (permalink / raw)
  To: intel-wired-lan


> On Dec 4, 2017, at 9:35 PM, Shannon Nelson <shannon.nelson@oracle.com> wrote:
> 
> Add a few routines to make access to the ipsec registers just a little
> easier, and throw in the beginnings of an initialization.
> 
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
> drivers/net/ethernet/intel/ixgbe/Makefile      |   1 +
> drivers/net/ethernet/intel/ixgbe/ixgbe.h       |   6 +
> drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 157 +++++++++++++++++++++++++
> drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h |  50 ++++++++
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |   1 +

<snip>

> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> new file mode 100644
> index 0000000..14dd011
> --- /dev/null
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -0,0 +1,157 @@
> +/*******************************************************************************
> + *
> + * Intel 10 Gigabit PCI Express Linux driver
> + * Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.

I don't think that it really makes sense to assert "All rights reserved" in something that is GPL. It makes it seem like something is being asserted that is counter to the GPL.

<snip>

> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
> new file mode 100644
> index 0000000..017b13f
> --- /dev/null
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
> @@ -0,0 +1,50 @@
> +/*******************************************************************************
> +
> +  Intel 10 Gigabit PCI Express Linux driver
> +  Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.

Likewise here.

-- 
Mark Rustad, Networking Division, Intel Corporation


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 02/10] ixgbe: add ipsec register access routines
  2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05 16:56     ` Alexander Duyck
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 16:56 UTC (permalink / raw)
  To: Shannon Nelson
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> Add a few routines to make access to the ipsec registers just a little
> easier, and throw in the beginnings of an initialization.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/Makefile      |   1 +
>  drivers/net/ethernet/intel/ixgbe/ixgbe.h       |   6 +
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 157 +++++++++++++++++++++++++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h |  50 ++++++++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |   1 +
>  5 files changed, 215 insertions(+)
>  create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>  create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/Makefile b/drivers/net/ethernet/intel/ixgbe/Makefile
> index 35e6fa6..8319465 100644
> --- a/drivers/net/ethernet/intel/ixgbe/Makefile
> +++ b/drivers/net/ethernet/intel/ixgbe/Makefile
> @@ -42,3 +42,4 @@ ixgbe-$(CONFIG_IXGBE_DCB) +=  ixgbe_dcb.o ixgbe_dcb_82598.o \
>  ixgbe-$(CONFIG_IXGBE_HWMON) += ixgbe_sysfs.o
>  ixgbe-$(CONFIG_DEBUG_FS) += ixgbe_debugfs.o
>  ixgbe-$(CONFIG_FCOE:m=y) += ixgbe_fcoe.o
> +ixgbe-$(CONFIG_XFRM_OFFLOAD) += ixgbe_ipsec.o
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index dd55787..1e11462 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -52,6 +52,7 @@
>  #ifdef CONFIG_IXGBE_DCA
>  #include <linux/dca.h>
>  #endif
> +#include "ixgbe_ipsec.h"
>
>  #include <net/busy_poll.h>
>
> @@ -1001,4 +1002,9 @@ void ixgbe_store_key(struct ixgbe_adapter *adapter);
>  void ixgbe_store_reta(struct ixgbe_adapter *adapter);
>  s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
>                        u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
> +#ifdef CONFIG_XFRM_OFFLOAD
> +void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
> +#else
> +static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
> +#endif /* CONFIG_XFRM_OFFLOAD */
>  #endif /* _IXGBE_H_ */
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> new file mode 100644
> index 0000000..14dd011
> --- /dev/null
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -0,0 +1,157 @@
> +/*******************************************************************************
> + *
> + * Intel 10 Gigabit PCI Express Linux driver
> + * Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program.  If not, see <http://www.gnu.org/licenses/>.
> + *
> + * The full GNU General Public License is included in this distribution in
> + * the file called "COPYING".
> + *
> + * Contact Information:
> + * Linux NICS <linux.nics@intel.com>
> + * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
> + * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
> + *
> + ******************************************************************************/
> +
> +#include "ixgbe.h"
> +
> +/**
> + * ixgbe_ipsec_set_tx_sa - set the Tx SA registers
> + * @hw: hw specific details
> + * @idx: register index to write
> + * @key: key byte array
> + * @salt: salt bytes
> + **/
> +static void ixgbe_ipsec_set_tx_sa(struct ixgbe_hw *hw, u16 idx,
> +                                 u32 key[], u32 salt)
> +{
> +       u32 reg;
> +       int i;
> +
> +       for (i = 0; i < 4; i++)
> +               IXGBE_WRITE_REG(hw, IXGBE_IPSTXKEY(i), cpu_to_be32(key[3-i]));
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXSALT, cpu_to_be32(salt));
> +       IXGBE_WRITE_FLUSH(hw);
> +
> +       reg = IXGBE_READ_REG(hw, IXGBE_IPSTXIDX);
> +       reg &= IXGBE_RXTXIDX_IPS_EN;
> +       reg |= idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, reg);
> +       IXGBE_WRITE_FLUSH(hw);
> +}
> +

So there are a few things here to unpack.

The first is the carry-forward of the IPS bit. I'm not sure that is
the best way to go. Do we really expect to be updating SA values if
IPsec offload is not enabled? If so we may just want to carry a bit
flag somewhere in the ixgbe_hw struct indicating if Tx IPsec offload
is enabled and use that to determine the value for this bit.

Also we should probably replace "3" with a value indicating that it is
the SA index shift.

Also technically the WRITE_FLUSH isn't needed if you are doing a PCIe
read anyway to get IPSTXIDX.

> +/**
> + * ixgbe_ipsec_set_rx_item - set an Rx table item
> + * @hw: hw specific details
> + * @idx: register index to write
> + * @tbl: table selector
> + *
> + * Trigger the device to store into a particular Rx table the
> + * data that has already been loaded into the input register
> + **/
> +static void ixgbe_ipsec_set_rx_item(struct ixgbe_hw *hw, u16 idx, u32 tbl)
> +{
> +       u32 reg;
> +
> +       reg = IXGBE_READ_REG(hw, IXGBE_IPSRXIDX);
> +       reg &= IXGBE_RXTXIDX_IPS_EN;
> +       reg |= tbl | idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, reg);
> +       IXGBE_WRITE_FLUSH(hw);
> +}
> +

The Rx version of this gets a bit trickier since the datasheet
actually indicates there are a few different types of tables that can
be indexed via this. Also why is the tbl value not being shifted? It
seems like it should be shifted by 1 to avoid overwriting the IPS_EN
bit. Really I would like to see the tbl value converted to an enum and
shifted by 1 in order to generate the table reference.

Here the "3" is a table index. It might be nice to call that out with
a name instead of using the magic number.

> +/**
> + * ixgbe_ipsec_set_rx_sa - set up the register bits to save SA info
> + * @hw: hw specific details
> + * @idx: register index to write
> + * @spi: security parameter index
> + * @key: key byte array
> + * @salt: salt bytes
> + * @mode: rx decrypt control bits
> + * @ip_idx: index into IP table for related IP address
> + **/
> +static void ixgbe_ipsec_set_rx_sa(struct ixgbe_hw *hw, u16 idx, __be32 spi,
> +                                 u32 key[], u32 salt, u32 mode, u32 ip_idx)
> +{
> +       int i;
> +
> +       /* store the SPI (in bigendian) and IPidx */
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXSPI, spi);
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPIDX, ip_idx);
> +       IXGBE_WRITE_FLUSH(hw);
> +
> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_SPI);
> +
> +       /* store the key, salt, and mode */
> +       for (i = 0; i < 4; i++)
> +               IXGBE_WRITE_REG(hw, IXGBE_IPSRXKEY(i), cpu_to_be32(key[3-i]));
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXSALT, cpu_to_be32(salt));
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXMOD, mode);
> +       IXGBE_WRITE_FLUSH(hw);
> +
> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_KEY);
> +}

Is there any reason why you could write the SPI, key, salt, and mode,
then flush, and trigger the writes via the IPSRXIDX? Just wondering
since it would likely save you a few cycles avoiding PCIe bus stalls.

> +
> +/**
> + * ixgbe_ipsec_set_rx_ip - set up the register bits to save SA IP addr info
> + * @hw: hw specific details
> + * @idx: register index to write
> + * @addr: IP address byte array
> + **/
> +static void ixgbe_ipsec_set_rx_ip(struct ixgbe_hw *hw, u16 idx, u32 addr[])
> +{
> +       int i;
> +
> +       /* store the ip address */
> +       for (i = 0; i < 4; i++)
> +               IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPADDR(i), addr[i]);
> +       IXGBE_WRITE_FLUSH(hw);
> +
> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_IP);
> +}
> +

This piece is kind of confusing. I would suggest storing the address
as a __be32 pointer instead of a u32 array. That way you start with
either an IPv6 or an IPv4 address at offset 0 instead of the way the
hardware is defined which has you writing it at either 0 or 3
depending on if the address is IPv6 or IPv4.

> +/**
> + * ixgbe_ipsec_clear_hw_tables - because some tables don't get cleared on reset
> + * @adapter: board private structure
> + **/
> +void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
> +{
> +       struct ixgbe_hw *hw = &adapter->hw;
> +       u32 buf[4] = {0, 0, 0, 0};
> +       u16 idx;
> +
> +       /* disable Rx and Tx SA lookup */
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, 0);
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, 0);
> +
> +       /* scrub the tables */
> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
> +               ixgbe_ipsec_set_tx_sa(hw, idx, buf, 0);
> +
> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
> +               ixgbe_ipsec_set_rx_sa(hw, idx, 0, buf, 0, 0, 0);
> +
> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_RX_IP_COUNT; idx++)
> +               ixgbe_ipsec_set_rx_ip(hw, idx, buf);
> +}
> +
> +/**
> + * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
> + * @adapter: board private structure
> + **/
> +void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
> +{
> +       ixgbe_ipsec_clear_hw_tables(adapter);
> +}
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
> new file mode 100644
> index 0000000..017b13f
> --- /dev/null
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
> @@ -0,0 +1,50 @@
> +/*******************************************************************************
> +
> +  Intel 10 Gigabit PCI Express Linux driver
> +  Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.
> +
> +  This program is free software; you can redistribute it and/or modify it
> +  under the terms and conditions of the GNU General Public License,
> +  version 2, as published by the Free Software Foundation.
> +
> +  This program is distributed in the hope it will be useful, but WITHOUT
> +  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> +  more details.
> +
> +  You should have received a copy of the GNU General Public License along with
> +  this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +  The full GNU General Public License is included in this distribution in
> +  the file called "COPYING".
> +
> +  Contact Information:
> +  Linux NICS <linux.nics@intel.com>
> +  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
> +  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
> +
> +*******************************************************************************/
> +
> +#ifndef _IXGBE_IPSEC_H_
> +#define _IXGBE_IPSEC_H_
> +
> +#define IXGBE_IPSEC_MAX_SA_COUNT       1024
> +#define IXGBE_IPSEC_MAX_RX_IP_COUNT    128
> +#define IXGBE_IPSEC_BASE_RX_INDEX      IXGBE_IPSEC_MAX_SA_COUNT
> +#define IXGBE_IPSEC_BASE_TX_INDEX      (IXGBE_IPSEC_MAX_SA_COUNT * 2)
> +
> +#define IXGBE_RXTXIDX_IPS_EN           0x00000001
> +#define IXGBE_RXIDX_TBL_MASK           0x00000006
> +#define IXGBE_RXIDX_TBL_IP             0x00000002
> +#define IXGBE_RXIDX_TBL_SPI            0x00000004
> +#define IXGBE_RXIDX_TBL_KEY            0x00000006

You might look at converting these table entries into an enum and add
a shift value. It will make things much easier to read.

> +#define IXGBE_RXTXIDX_IDX_MASK         0x00001ff8
> +#define IXGBE_RXTXIDX_IDX_READ         0x40000000
> +#define IXGBE_RXTXIDX_IDX_WRITE                0x80000000
> +
> +#define IXGBE_RXMOD_VALID              0x00000001
> +#define IXGBE_RXMOD_PROTO_ESP          0x00000004
> +#define IXGBE_RXMOD_DECRYPT            0x00000008
> +#define IXGBE_RXMOD_IPV6               0x00000010
> +
> +#endif /* _IXGBE_IPSEC_H_ */
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 6d5f31e..51fb3cf 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -10327,6 +10327,7 @@ static int ixgbe_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>                                          NETIF_F_FCOE_MTU;
>         }
>  #endif /* IXGBE_FCOE */
> +       ixgbe_init_ipsec_offload(adapter);
>
>         if (adapter->flags2 & IXGBE_FLAG2_RSC_CAPABLE)
>                 netdev->hw_features |= NETIF_F_LRO;
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 02/10] ixgbe: add ipsec register access routines
@ 2017-12-05 16:56     ` Alexander Duyck
  0 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 16:56 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> Add a few routines to make access to the ipsec registers just a little
> easier, and throw in the beginnings of an initialization.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/Makefile      |   1 +
>  drivers/net/ethernet/intel/ixgbe/ixgbe.h       |   6 +
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 157 +++++++++++++++++++++++++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h |  50 ++++++++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |   1 +
>  5 files changed, 215 insertions(+)
>  create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>  create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/Makefile b/drivers/net/ethernet/intel/ixgbe/Makefile
> index 35e6fa6..8319465 100644
> --- a/drivers/net/ethernet/intel/ixgbe/Makefile
> +++ b/drivers/net/ethernet/intel/ixgbe/Makefile
> @@ -42,3 +42,4 @@ ixgbe-$(CONFIG_IXGBE_DCB) +=  ixgbe_dcb.o ixgbe_dcb_82598.o \
>  ixgbe-$(CONFIG_IXGBE_HWMON) += ixgbe_sysfs.o
>  ixgbe-$(CONFIG_DEBUG_FS) += ixgbe_debugfs.o
>  ixgbe-$(CONFIG_FCOE:m=y) += ixgbe_fcoe.o
> +ixgbe-$(CONFIG_XFRM_OFFLOAD) += ixgbe_ipsec.o
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index dd55787..1e11462 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -52,6 +52,7 @@
>  #ifdef CONFIG_IXGBE_DCA
>  #include <linux/dca.h>
>  #endif
> +#include "ixgbe_ipsec.h"
>
>  #include <net/busy_poll.h>
>
> @@ -1001,4 +1002,9 @@ void ixgbe_store_key(struct ixgbe_adapter *adapter);
>  void ixgbe_store_reta(struct ixgbe_adapter *adapter);
>  s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
>                        u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
> +#ifdef CONFIG_XFRM_OFFLOAD
> +void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
> +#else
> +static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
> +#endif /* CONFIG_XFRM_OFFLOAD */
>  #endif /* _IXGBE_H_ */
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> new file mode 100644
> index 0000000..14dd011
> --- /dev/null
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -0,0 +1,157 @@
> +/*******************************************************************************
> + *
> + * Intel 10 Gigabit PCI Express Linux driver
> + * Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program.  If not, see <http://www.gnu.org/licenses/>.
> + *
> + * The full GNU General Public License is included in this distribution in
> + * the file called "COPYING".
> + *
> + * Contact Information:
> + * Linux NICS <linux.nics@intel.com>
> + * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
> + * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
> + *
> + ******************************************************************************/
> +
> +#include "ixgbe.h"
> +
> +/**
> + * ixgbe_ipsec_set_tx_sa - set the Tx SA registers
> + * @hw: hw specific details
> + * @idx: register index to write
> + * @key: key byte array
> + * @salt: salt bytes
> + **/
> +static void ixgbe_ipsec_set_tx_sa(struct ixgbe_hw *hw, u16 idx,
> +                                 u32 key[], u32 salt)
> +{
> +       u32 reg;
> +       int i;
> +
> +       for (i = 0; i < 4; i++)
> +               IXGBE_WRITE_REG(hw, IXGBE_IPSTXKEY(i), cpu_to_be32(key[3-i]));
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXSALT, cpu_to_be32(salt));
> +       IXGBE_WRITE_FLUSH(hw);
> +
> +       reg = IXGBE_READ_REG(hw, IXGBE_IPSTXIDX);
> +       reg &= IXGBE_RXTXIDX_IPS_EN;
> +       reg |= idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, reg);
> +       IXGBE_WRITE_FLUSH(hw);
> +}
> +

So there are a few things here to unpack.

The first is the carry-forward of the IPS bit. I'm not sure that is
the best way to go. Do we really expect to be updating SA values if
IPsec offload is not enabled? If so we may just want to carry a bit
flag somewhere in the ixgbe_hw struct indicating if Tx IPsec offload
is enabled and use that to determine the value for this bit.

Also we should probably replace "3" with a value indicating that it is
the SA index shift.

Also technically the WRITE_FLUSH isn't needed if you are doing a PCIe
read anyway to get IPSTXIDX.

> +/**
> + * ixgbe_ipsec_set_rx_item - set an Rx table item
> + * @hw: hw specific details
> + * @idx: register index to write
> + * @tbl: table selector
> + *
> + * Trigger the device to store into a particular Rx table the
> + * data that has already been loaded into the input register
> + **/
> +static void ixgbe_ipsec_set_rx_item(struct ixgbe_hw *hw, u16 idx, u32 tbl)
> +{
> +       u32 reg;
> +
> +       reg = IXGBE_READ_REG(hw, IXGBE_IPSRXIDX);
> +       reg &= IXGBE_RXTXIDX_IPS_EN;
> +       reg |= tbl | idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, reg);
> +       IXGBE_WRITE_FLUSH(hw);
> +}
> +

The Rx version of this gets a bit trickier since the datasheet
actually indicates there are a few different types of tables that can
be indexed via this. Also why is the tbl value not being shifted? It
seems like it should be shifted by 1 to avoid overwriting the IPS_EN
bit. Really I would like to see the tbl value converted to an enum and
shifted by 1 in order to generate the table reference.

Here the "3" is a table index. It might be nice to call that out with
a name instead of using the magic number.

> +/**
> + * ixgbe_ipsec_set_rx_sa - set up the register bits to save SA info
> + * @hw: hw specific details
> + * @idx: register index to write
> + * @spi: security parameter index
> + * @key: key byte array
> + * @salt: salt bytes
> + * @mode: rx decrypt control bits
> + * @ip_idx: index into IP table for related IP address
> + **/
> +static void ixgbe_ipsec_set_rx_sa(struct ixgbe_hw *hw, u16 idx, __be32 spi,
> +                                 u32 key[], u32 salt, u32 mode, u32 ip_idx)
> +{
> +       int i;
> +
> +       /* store the SPI (in bigendian) and IPidx */
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXSPI, spi);
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPIDX, ip_idx);
> +       IXGBE_WRITE_FLUSH(hw);
> +
> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_SPI);
> +
> +       /* store the key, salt, and mode */
> +       for (i = 0; i < 4; i++)
> +               IXGBE_WRITE_REG(hw, IXGBE_IPSRXKEY(i), cpu_to_be32(key[3-i]));
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXSALT, cpu_to_be32(salt));
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXMOD, mode);
> +       IXGBE_WRITE_FLUSH(hw);
> +
> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_KEY);
> +}

Is there any reason why you could write the SPI, key, salt, and mode,
then flush, and trigger the writes via the IPSRXIDX? Just wondering
since it would likely save you a few cycles avoiding PCIe bus stalls.

> +
> +/**
> + * ixgbe_ipsec_set_rx_ip - set up the register bits to save SA IP addr info
> + * @hw: hw specific details
> + * @idx: register index to write
> + * @addr: IP address byte array
> + **/
> +static void ixgbe_ipsec_set_rx_ip(struct ixgbe_hw *hw, u16 idx, u32 addr[])
> +{
> +       int i;
> +
> +       /* store the ip address */
> +       for (i = 0; i < 4; i++)
> +               IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPADDR(i), addr[i]);
> +       IXGBE_WRITE_FLUSH(hw);
> +
> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_IP);
> +}
> +

This piece is kind of confusing. I would suggest storing the address
as a __be32 pointer instead of a u32 array. That way you start with
either an IPv6 or an IPv4 address at offset 0 instead of the way the
hardware is defined which has you writing it at either 0 or 3
depending on if the address is IPv6 or IPv4.

> +/**
> + * ixgbe_ipsec_clear_hw_tables - because some tables don't get cleared on reset
> + * @adapter: board private structure
> + **/
> +void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
> +{
> +       struct ixgbe_hw *hw = &adapter->hw;
> +       u32 buf[4] = {0, 0, 0, 0};
> +       u16 idx;
> +
> +       /* disable Rx and Tx SA lookup */
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, 0);
> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, 0);
> +
> +       /* scrub the tables */
> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
> +               ixgbe_ipsec_set_tx_sa(hw, idx, buf, 0);
> +
> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
> +               ixgbe_ipsec_set_rx_sa(hw, idx, 0, buf, 0, 0, 0);
> +
> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_RX_IP_COUNT; idx++)
> +               ixgbe_ipsec_set_rx_ip(hw, idx, buf);
> +}
> +
> +/**
> + * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
> + * @adapter: board private structure
> + **/
> +void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
> +{
> +       ixgbe_ipsec_clear_hw_tables(adapter);
> +}
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
> new file mode 100644
> index 0000000..017b13f
> --- /dev/null
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
> @@ -0,0 +1,50 @@
> +/*******************************************************************************
> +
> +  Intel 10 Gigabit PCI Express Linux driver
> +  Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.
> +
> +  This program is free software; you can redistribute it and/or modify it
> +  under the terms and conditions of the GNU General Public License,
> +  version 2, as published by the Free Software Foundation.
> +
> +  This program is distributed in the hope it will be useful, but WITHOUT
> +  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> +  more details.
> +
> +  You should have received a copy of the GNU General Public License along with
> +  this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +  The full GNU General Public License is included in this distribution in
> +  the file called "COPYING".
> +
> +  Contact Information:
> +  Linux NICS <linux.nics@intel.com>
> +  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
> +  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
> +
> +*******************************************************************************/
> +
> +#ifndef _IXGBE_IPSEC_H_
> +#define _IXGBE_IPSEC_H_
> +
> +#define IXGBE_IPSEC_MAX_SA_COUNT       1024
> +#define IXGBE_IPSEC_MAX_RX_IP_COUNT    128
> +#define IXGBE_IPSEC_BASE_RX_INDEX      IXGBE_IPSEC_MAX_SA_COUNT
> +#define IXGBE_IPSEC_BASE_TX_INDEX      (IXGBE_IPSEC_MAX_SA_COUNT * 2)
> +
> +#define IXGBE_RXTXIDX_IPS_EN           0x00000001
> +#define IXGBE_RXIDX_TBL_MASK           0x00000006
> +#define IXGBE_RXIDX_TBL_IP             0x00000002
> +#define IXGBE_RXIDX_TBL_SPI            0x00000004
> +#define IXGBE_RXIDX_TBL_KEY            0x00000006

You might look at converting these table entries into an enum and add
a shift value. It will make things much easier to read.

> +#define IXGBE_RXTXIDX_IDX_MASK         0x00001ff8
> +#define IXGBE_RXTXIDX_IDX_READ         0x40000000
> +#define IXGBE_RXTXIDX_IDX_WRITE                0x80000000
> +
> +#define IXGBE_RXMOD_VALID              0x00000001
> +#define IXGBE_RXMOD_PROTO_ESP          0x00000004
> +#define IXGBE_RXMOD_DECRYPT            0x00000008
> +#define IXGBE_RXMOD_IPV6               0x00000010
> +
> +#endif /* _IXGBE_IPSEC_H_ */
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 6d5f31e..51fb3cf 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -10327,6 +10327,7 @@ static int ixgbe_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>                                          NETIF_F_FCOE_MTU;
>         }
>  #endif /* IXGBE_FCOE */
> +       ixgbe_init_ipsec_offload(adapter);
>
>         if (adapter->flags2 & IXGBE_FLAG2_RSC_CAPABLE)
>                 netdev->hw_features |= NETIF_F_LRO;
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan at osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 04/10] ixgbe: add ipsec data structures
  2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05 17:03     ` Alexander Duyck
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 17:03 UTC (permalink / raw)
  To: Shannon Nelson
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> Set up the data structures to be used by the ipsec offload.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  5 ++++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h | 40 ++++++++++++++++++++++++++
>  2 files changed, 45 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index 1e11462..9487750 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -622,6 +622,7 @@ struct ixgbe_adapter {
>  #define IXGBE_FLAG2_EEE_CAPABLE                        BIT(14)
>  #define IXGBE_FLAG2_EEE_ENABLED                        BIT(15)
>  #define IXGBE_FLAG2_RX_LEGACY                  BIT(16)
> +#define IXGBE_FLAG2_IPSEC_ENABLED              BIT(17)
>
>         /* Tx fast path data */
>         int num_tx_queues;
> @@ -772,6 +773,10 @@ struct ixgbe_adapter {
>
>  #define IXGBE_RSS_KEY_SIZE     40  /* size of RSS Hash Key in bytes */
>         u32 *rss_key;
> +
> +#ifdef CONFIG_XFRM
> +       struct ixgbe_ipsec *ipsec;
> +#endif /* CONFIG_XFRM */
>  };
>
>  static inline u8 ixgbe_max_rss_indices(struct ixgbe_adapter *adapter)
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
> index 017b13f..cb9a4be 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
> @@ -47,4 +47,44 @@
>  #define IXGBE_RXMOD_DECRYPT            0x00000008
>  #define IXGBE_RXMOD_IPV6               0x00000010
>
> +struct rx_sa {
> +       struct hlist_node hlist;
> +       struct xfrm_state *xs;
> +       u32 ipaddr[4];

ipaddr should be stored as a __be32, not a u32.

> +       u32 key[4];
> +       u32 salt;
> +       u32 mode;
> +       u8  iptbl_ind;
> +       bool used;
> +       bool decrypt;
> +};
> +
> +struct rx_ip_sa {
> +       u32 ipaddr[4];

Same thing here.

> +       u32 ref_cnt;
> +       bool used;
> +};
> +
> +struct tx_sa {
> +       struct xfrm_state *xs;
> +       u32 key[4];
> +       u32 salt;
> +       bool encrypt;
> +       bool used;
> +};
> +
> +struct ixgbe_ipsec_tx_data {
> +       u32 flags;
> +       u16 trailer_len;
> +       u16 sa_idx;
> +};
> +
> +struct ixgbe_ipsec {
> +       u16 num_rx_sa;
> +       u16 num_tx_sa;
> +       struct rx_ip_sa *ip_tbl;
> +       struct rx_sa *rx_tbl;
> +       struct tx_sa *tx_tbl;
> +       DECLARE_HASHTABLE(rx_sa_list, 8);

The hash table seems a bit on the small side. You might look at
increasing this to something like 32 in order to try and cut down on
the load in each bucket since the upper limit is 1K or so isn't it?

> +};
>  #endif /* _IXGBE_IPSEC_H_ */
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 04/10] ixgbe: add ipsec data structures
@ 2017-12-05 17:03     ` Alexander Duyck
  0 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 17:03 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> Set up the data structures to be used by the ipsec offload.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  5 ++++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h | 40 ++++++++++++++++++++++++++
>  2 files changed, 45 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index 1e11462..9487750 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -622,6 +622,7 @@ struct ixgbe_adapter {
>  #define IXGBE_FLAG2_EEE_CAPABLE                        BIT(14)
>  #define IXGBE_FLAG2_EEE_ENABLED                        BIT(15)
>  #define IXGBE_FLAG2_RX_LEGACY                  BIT(16)
> +#define IXGBE_FLAG2_IPSEC_ENABLED              BIT(17)
>
>         /* Tx fast path data */
>         int num_tx_queues;
> @@ -772,6 +773,10 @@ struct ixgbe_adapter {
>
>  #define IXGBE_RSS_KEY_SIZE     40  /* size of RSS Hash Key in bytes */
>         u32 *rss_key;
> +
> +#ifdef CONFIG_XFRM
> +       struct ixgbe_ipsec *ipsec;
> +#endif /* CONFIG_XFRM */
>  };
>
>  static inline u8 ixgbe_max_rss_indices(struct ixgbe_adapter *adapter)
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
> index 017b13f..cb9a4be 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
> @@ -47,4 +47,44 @@
>  #define IXGBE_RXMOD_DECRYPT            0x00000008
>  #define IXGBE_RXMOD_IPV6               0x00000010
>
> +struct rx_sa {
> +       struct hlist_node hlist;
> +       struct xfrm_state *xs;
> +       u32 ipaddr[4];

ipaddr should be stored as a __be32, not a u32.

> +       u32 key[4];
> +       u32 salt;
> +       u32 mode;
> +       u8  iptbl_ind;
> +       bool used;
> +       bool decrypt;
> +};
> +
> +struct rx_ip_sa {
> +       u32 ipaddr[4];

Same thing here.

> +       u32 ref_cnt;
> +       bool used;
> +};
> +
> +struct tx_sa {
> +       struct xfrm_state *xs;
> +       u32 key[4];
> +       u32 salt;
> +       bool encrypt;
> +       bool used;
> +};
> +
> +struct ixgbe_ipsec_tx_data {
> +       u32 flags;
> +       u16 trailer_len;
> +       u16 sa_idx;
> +};
> +
> +struct ixgbe_ipsec {
> +       u16 num_rx_sa;
> +       u16 num_tx_sa;
> +       struct rx_ip_sa *ip_tbl;
> +       struct rx_sa *rx_tbl;
> +       struct tx_sa *tx_tbl;
> +       DECLARE_HASHTABLE(rx_sa_list, 8);

The hash table seems a bit on the small side. You might look at
increasing this to something like 32 in order to try and cut down on
the load in each bucket since the upper limit is 1K or so isn't it?

> +};
>  #endif /* _IXGBE_IPSEC_H_ */
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan at osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 05/10] ixgbe: implement ipsec add and remove of offloaded SA
  2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05 17:26     ` Alexander Duyck
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 17:26 UTC (permalink / raw)
  To: Shannon Nelson
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> Add the functions for setting up and removing offloaded SAs (Security
> Associations) with the x540 hardware.  We set up the callback structure
> but we don't yet set the hardware feature bit to be sure the XFRM service
> won't actually try to use us for an offload yet.
>
> The software tables are made up to mimic the hardware tables to make it
> easier to track what's in the hardware, and the SA table index is used
> for the XFRM offload handle.  However, there is a hashing field in the
> Rx SA tracking that will be used to facilitate faster table searches in
> the Rx fast path.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 377 +++++++++++++++++++++++++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |   6 +
>  2 files changed, 383 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> index 38a1a16..7b01d92 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -26,6 +26,8 @@
>   ******************************************************************************/
>
>  #include "ixgbe.h"
> +#include <net/xfrm.h>
> +#include <crypto/aead.h>
>
>  /**
>   * ixgbe_ipsec_set_tx_sa - set the Tx SA registers
> @@ -128,6 +130,7 @@ static void ixgbe_ipsec_set_rx_ip(struct ixgbe_hw *hw, u16 idx, u32 addr[])
>   **/
>  void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
>  {
> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>         struct ixgbe_hw *hw = &adapter->hw;
>         u32 buf[4] = {0, 0, 0, 0};
>         u16 idx;
> @@ -139,9 +142,11 @@ void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
>         /* scrub the tables */
>         for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
>                 ixgbe_ipsec_set_tx_sa(hw, idx, buf, 0);
> +       ipsec->num_tx_sa = 0;
>
>         for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
>                 ixgbe_ipsec_set_rx_sa(hw, idx, 0, buf, 0, 0, 0);
> +       ipsec->num_rx_sa = 0;
>
>         for (idx = 0; idx < IXGBE_IPSEC_MAX_RX_IP_COUNT; idx++)
>                 ixgbe_ipsec_set_rx_ip(hw, idx, buf);
> @@ -287,11 +292,383 @@ static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
>  }
>
>  /**
> + * ixgbe_ipsec_find_empty_idx - find the first unused security parameter index
> + * @ipsec: pointer to ipsec struct
> + * @rxtable: true if we need to look in the Rx table
> + *
> + * Returns the first unused index in either the Rx or Tx SA table
> + **/
> +static int ixgbe_ipsec_find_empty_idx(struct ixgbe_ipsec *ipsec, bool rxtable)
> +{
> +       u32 i;
> +
> +       if (rxtable) {
> +               if (ipsec->num_rx_sa == IXGBE_IPSEC_MAX_SA_COUNT)
> +                       return -ENOSPC;
> +
> +               /* search rx sa table */
> +               for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
> +                       if (!ipsec->rx_tbl[i].used)
> +                               return i;
> +               }
> +       } else {
> +               if (ipsec->num_rx_sa == IXGBE_IPSEC_MAX_SA_COUNT)
> +                       return -ENOSPC;

Should this bi num_tx_sa?

> +
> +               /* search tx sa table */
> +               for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
> +                       if (!ipsec->tx_tbl[i].used)
> +                               return i;
> +               }
> +       }
> +
> +       return -ENOSPC;
> +}
> +
> +/**
> + * ixgbe_ipsec_parse_proto_keys - find the key and salt based on the protocol
> + * @xs: pointer to xfrm_state struct
> + * @mykey: pointer to key array to populate
> + * @mysalt: pointer to salt value to populate
> + *
> + * This copies the protocol keys and salt to our own data tables.  The
> + * 82599 family only supports the one algorithm.
> + **/
> +static int ixgbe_ipsec_parse_proto_keys(struct xfrm_state *xs,
> +                                       u32 *mykey, u32 *mysalt)
> +{
> +       struct net_device *dev = xs->xso.dev;
> +       unsigned char *key_data;
> +       char *alg_name = NULL;
> +       char *aes_gcm_name = "rfc4106(gcm(aes))";

aes_gcm_name should probably be a static const char array instead of a pointer.

> +       int key_len;
> +
> +       if (xs->aead) {
> +               key_data = &xs->aead->alg_key[0];
> +               key_len = xs->aead->alg_key_len;
> +               alg_name = xs->aead->alg_name;
> +       } else {
> +               netdev_err(dev, "Unsupported IPsec algorithm\n");
> +               return -EINVAL;
> +       }
> +
> +       if (strcmp(alg_name, aes_gcm_name)) {
> +               netdev_err(dev, "Unsupported IPsec algorithm - please use %s\n",
> +                          aes_gcm_name);
> +               return -EINVAL;
> +       }
> +
> +       /* 160 accounts for 16 byte key and 4 byte salt */
> +       if (key_len == 128) {
> +               netdev_info(dev, "IPsec hw offload parameters missing 32 bit salt value\n");
> +       } else if (key_len != 160) {
> +               netdev_err(dev, "IPsec hw offload only supports keys up to 128 bits with a 32 bit salt\n");
> +               return -EINVAL;
> +       }
> +
> +       /* The key bytes come down in a bigendian array of bytes, and
> +        * salt is always the last 4 bytes of the key array.
> +        * We don't need to do any byteswapping.
> +        */
> +       memcpy(mykey, key_data, 16);
> +       if (key_len == 160)
> +               *mysalt = ((u32 *)key_data)[4];
> +       else
> +               *mysalt = 0;

You could combine these key_len checks into a single if/else set.
Basically just do something like the following:

/* 160 accounts for 16 byte key and 4 byte salt */
if (key_len == 160) {
         *mysalt = ((u32 *)key_data)[4];
} else if (key_len != 128) {
        netdev_err(dev, "IPsec hw offload only supports keys up to 128
bits with a 32 bit salt\n");
        return -EINVAL;
} else {
        netdev_info(dev, "IPsec hw offload parameters missing 32 bit
salt value\n");
        *mysalt = 0;
}

 /* The key bytes come down in a bigendian array of bytes, and
  * salt is always the last 4 bytes of the key array.
  * We don't need to do any byteswapping.
  */
memcpy(mykey, key_data, 16);

> +
> +       return 0;
> +}
> +
> +/**
> + * ixgbe_ipsec_add_sa - program device with a security association
> + * @xs: pointer to transformer state struct
> + **/
> +static int ixgbe_ipsec_add_sa(struct xfrm_state *xs)
> +{
> +       struct net_device *dev = xs->xso.dev;
> +       struct ixgbe_adapter *adapter = netdev_priv(dev);
> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
> +       struct ixgbe_hw *hw = &adapter->hw;
> +       int checked, match, first;
> +       u16 sa_idx;
> +       int ret;
> +       int i;
> +
> +       if (xs->id.proto != IPPROTO_ESP && xs->id.proto != IPPROTO_AH) {
> +               netdev_err(dev, "Unsupported protocol 0x%04x for ipsec offload\n",
> +                          xs->id.proto);
> +               return -EINVAL;
> +       }
> +
> +       if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
> +               struct rx_sa rsa;
> +
> +               if (xs->calg) {
> +                       netdev_err(dev, "Compression offload not supported\n");
> +                       return -EINVAL;
> +               }
> +
> +               /* find the first unused index */
> +               ret = ixgbe_ipsec_find_empty_idx(ipsec, true);
> +               if (ret < 0) {
> +                       netdev_err(dev, "No space for SA in Rx table!\n");
> +                       return ret;
> +               }
> +               sa_idx = (u16)ret;
> +
> +               memset(&rsa, 0, sizeof(rsa));
> +               rsa.used = true;
> +               rsa.xs = xs;
> +
> +               if (rsa.xs->id.proto & IPPROTO_ESP)
> +                       rsa.decrypt = xs->ealg || xs->aead;
> +
> +               /* get the key and salt */
> +               ret = ixgbe_ipsec_parse_proto_keys(xs, rsa.key, &rsa.salt);
> +               if (ret) {
> +                       netdev_err(dev, "Failed to get key data for Rx SA table\n");
> +                       return ret;
> +               }
> +
> +               /* get ip for rx sa table */
> +               if (xs->xso.flags & XFRM_OFFLOAD_IPV6)
> +                       memcpy(rsa.ipaddr, &xs->id.daddr.a6, 16);
> +               else
> +                       memcpy(&rsa.ipaddr[3], &xs->id.daddr.a4, 4);
> +
> +               /* The HW does not have a 1:1 mapping from keys to IP addrs, so
> +                * check for a matching IP addr entry in the table.  If the addr
> +                * already exists, use it; else find an unused slot and add the
> +                * addr.  If one does not exist and there are no unused table
> +                * entries, fail the request.
> +                */
> +
> +               /* Find an existing match or first not used, and stop looking
> +                * after we've checked all we know we have.
> +                */
> +               checked = 0;
> +               match = -1;
> +               first = -1;
> +               for (i = 0;
> +                    i < IXGBE_IPSEC_MAX_RX_IP_COUNT &&
> +                    (checked < ipsec->num_rx_sa || first < 0);
> +                    i++) {
> +                       if (ipsec->ip_tbl[i].used) {
> +                               if (!memcmp(ipsec->ip_tbl[i].ipaddr,
> +                                           rsa.ipaddr, sizeof(rsa.ipaddr))) {
> +                                       match = i;
> +                                       break;
> +                               }
> +                               checked++;
> +                       } else if (first < 0) {
> +                               first = i;  /* track the first empty seen */
> +                       }
> +               }
> +
> +               if (ipsec->num_rx_sa == 0)
> +                       first = 0;
> +
> +               if (match >= 0) {
> +                       /* addrs are the same, we should use this one */
> +                       rsa.iptbl_ind = match;
> +                       ipsec->ip_tbl[match].ref_cnt++;
> +
> +               } else if (first >= 0) {
> +                       /* no matches, but here's an empty slot */
> +                       rsa.iptbl_ind = first;
> +
> +                       memcpy(ipsec->ip_tbl[first].ipaddr,
> +                              rsa.ipaddr, sizeof(rsa.ipaddr));
> +                       ipsec->ip_tbl[first].ref_cnt = 1;
> +                       ipsec->ip_tbl[first].used = true;
> +
> +                       ixgbe_ipsec_set_rx_ip(hw, rsa.iptbl_ind, rsa.ipaddr);
> +
> +               } else {
> +                       /* no match and no empty slot */
> +                       netdev_err(dev, "No space for SA in Rx IP SA table\n");
> +                       memset(&rsa, 0, sizeof(rsa));
> +                       return -ENOSPC;
> +               }
> +
> +               rsa.mode = IXGBE_RXMOD_VALID;
> +               if (rsa.xs->id.proto & IPPROTO_ESP)
> +                       rsa.mode |= IXGBE_RXMOD_PROTO_ESP;
> +               if (rsa.decrypt)
> +                       rsa.mode |= IXGBE_RXMOD_DECRYPT;
> +               if (rsa.xs->xso.flags & XFRM_OFFLOAD_IPV6)
> +                       rsa.mode |= IXGBE_RXMOD_IPV6;
> +
> +               /* the preparations worked, so save the info */
> +               memcpy(&ipsec->rx_tbl[sa_idx], &rsa, sizeof(rsa));
> +
> +               ixgbe_ipsec_set_rx_sa(hw, sa_idx, rsa.xs->id.spi, rsa.key,
> +                                     rsa.salt, rsa.mode, rsa.iptbl_ind);
> +               xs->xso.offload_handle = sa_idx + IXGBE_IPSEC_BASE_RX_INDEX;
> +
> +               ipsec->num_rx_sa++;
> +
> +               /* hash the new entry for faster search in Rx path */
> +               hash_add_rcu(ipsec->rx_sa_list, &ipsec->rx_tbl[sa_idx].hlist,
> +                            rsa.xs->id.spi);
> +       } else {
> +               struct tx_sa tsa;
> +
> +               /* find the first unused index */
> +               ret = ixgbe_ipsec_find_empty_idx(ipsec, false);
> +               if (ret < 0) {
> +                       netdev_err(dev, "No space for SA in Tx table\n");
> +                       return ret;
> +               }
> +               sa_idx = (u16)ret;
> +
> +               memset(&tsa, 0, sizeof(tsa));
> +               tsa.used = true;
> +               tsa.xs = xs;
> +
> +               if (xs->id.proto & IPPROTO_ESP)
> +                       tsa.encrypt = xs->ealg || xs->aead;
> +
> +               ret = ixgbe_ipsec_parse_proto_keys(xs, tsa.key, &tsa.salt);
> +               if (ret) {
> +                       netdev_err(dev, "Failed to get key data for Tx SA table\n");
> +                       memset(&tsa, 0, sizeof(tsa));
> +                       return ret;
> +               }
> +
> +               /* the preparations worked, so save the info */
> +               memcpy(&ipsec->tx_tbl[sa_idx], &tsa, sizeof(tsa));
> +
> +               ixgbe_ipsec_set_tx_sa(hw, sa_idx, tsa.key, tsa.salt);
> +
> +               xs->xso.offload_handle = sa_idx + IXGBE_IPSEC_BASE_TX_INDEX;
> +
> +               ipsec->num_tx_sa++;
> +       }
> +
> +       /* enable the engine if not already warmed up */
> +       if (!(adapter->flags2 & IXGBE_FLAG2_IPSEC_ENABLED)) {
> +               ixgbe_ipsec_start_engine(adapter);
> +               adapter->flags2 |= IXGBE_FLAG2_IPSEC_ENABLED;
> +       }
> +
> +       return 0;
> +}
> +
> +/**
> + * ixgbe_ipsec_del_sa - clear out this specific SA
> + * @xs: pointer to transformer state struct
> + **/
> +static void ixgbe_ipsec_del_sa(struct xfrm_state *xs)
> +{
> +       struct net_device *dev = xs->xso.dev;
> +       struct ixgbe_adapter *adapter = netdev_priv(dev);
> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
> +       struct ixgbe_hw *hw = &adapter->hw;
> +       u32 zerobuf[4] = {0, 0, 0, 0};
> +       u16 sa_idx;
> +
> +       if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
> +               struct rx_sa *rsa;
> +               u8 ipi;
> +
> +               sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_RX_INDEX;
> +               rsa = &ipsec->rx_tbl[sa_idx];
> +
> +               if (!rsa->used) {
> +                       netdev_err(dev, "Invalid Rx SA selected sa_idx=%d offload_handle=%lu\n",
> +                                  sa_idx, xs->xso.offload_handle);
> +                       return;
> +               }
> +
> +               ixgbe_ipsec_set_rx_sa(hw, sa_idx, 0, zerobuf, 0, 0, 0);
> +               hash_del_rcu(&rsa->hlist);
> +
> +               /* if the IP table entry is referenced by only this SA,
> +                * i.e. ref_cnt is only 1, clear the IP table entry as well
> +                */
> +               ipi = rsa->iptbl_ind;
> +               if (ipsec->ip_tbl[ipi].ref_cnt > 0) {
> +                       ipsec->ip_tbl[ipi].ref_cnt--;
> +
> +                       if (!ipsec->ip_tbl[ipi].ref_cnt) {
> +                               memset(&ipsec->ip_tbl[ipi], 0,
> +                                      sizeof(struct rx_ip_sa));
> +                               ixgbe_ipsec_set_rx_ip(hw, ipi, zerobuf);
> +                       }
> +               }
> +
> +               memset(rsa, 0, sizeof(struct rx_sa));
> +               ipsec->num_rx_sa--;
> +       } else {
> +               sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX;
> +
> +               if (!ipsec->tx_tbl[sa_idx].used) {
> +                       netdev_err(dev, "Invalid Tx SA selected sa_idx=%d offload_handle=%lu\n",
> +                                  sa_idx, xs->xso.offload_handle);
> +                       return;
> +               }
> +
> +               ixgbe_ipsec_set_tx_sa(hw, sa_idx, zerobuf, 0);
> +               memset(&ipsec->tx_tbl[sa_idx], 0, sizeof(struct tx_sa));
> +               ipsec->num_tx_sa--;
> +       }
> +
> +       /* if there are no SAs left, stop the engine to save energy */
> +       if (ipsec->num_rx_sa == 0 && ipsec->num_tx_sa == 0) {
> +               adapter->flags2 &= ~IXGBE_FLAG2_IPSEC_ENABLED;
> +               ixgbe_ipsec_stop_engine(adapter);
> +       }
> +}
> +
> +static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
> +       .xdo_dev_state_add = ixgbe_ipsec_add_sa,
> +       .xdo_dev_state_delete = ixgbe_ipsec_del_sa,
> +};
> +
> +/**
>   * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
>   * @adapter: board private structure
>   **/
>  void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
>  {
> +       struct ixgbe_ipsec *ipsec;
> +       size_t size;
> +
> +       ipsec = kzalloc(sizeof(*ipsec), GFP_KERNEL);
> +       if (!ipsec)
> +               goto err;

I would say just add another label to skip over the if statement you
added below.

> +       hash_init(ipsec->rx_sa_list);
> +
> +       size = sizeof(struct rx_sa) * IXGBE_IPSEC_MAX_SA_COUNT;
> +       ipsec->rx_tbl = kzalloc(size, GFP_KERNEL);
> +       if (!ipsec->rx_tbl)
> +               goto err;
> +
> +       size = sizeof(struct tx_sa) * IXGBE_IPSEC_MAX_SA_COUNT;
> +       ipsec->tx_tbl = kzalloc(size, GFP_KERNEL);
> +       if (!ipsec->tx_tbl)
> +               goto err;
> +
> +       size = sizeof(struct rx_ip_sa) * IXGBE_IPSEC_MAX_RX_IP_COUNT;
> +       ipsec->ip_tbl = kzalloc(size, GFP_KERNEL);
> +       if (!ipsec->ip_tbl)
> +               goto err;

Do all these tables need to be allocated separately? I'm just
wondering if we can get away with doing something like what we did
with the ixgbe_q_vector structure where you just allocate this as one
physical block of memory and just split it up into multiple chunks
with a separate pointer to each chunk. Doing that would cut down on
the exception handling needed since it would be a single allocation
failure you would have to deal with.

> +       ipsec->num_rx_sa = 0;
> +       ipsec->num_tx_sa = 0;
> +
> +       adapter->ipsec = ipsec;
>         ixgbe_ipsec_clear_hw_tables(adapter);
>         ixgbe_ipsec_stop_engine(adapter);
> +
> +       return;
> +err:
> +       if (ipsec) {
> +               kfree(ipsec->ip_tbl);
> +               kfree(ipsec->rx_tbl);
> +               kfree(ipsec->tx_tbl);
> +               kfree(adapter->ipsec);
> +       }
> +       netdev_err(adapter->netdev, "Unable to allocate memory for SA tables");
>  }
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 51fb3cf..01fd89b 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -10542,6 +10542,12 @@ static void ixgbe_remove(struct pci_dev *pdev)
>         set_bit(__IXGBE_REMOVING, &adapter->state);
>         cancel_work_sync(&adapter->service_task);
>
> +#ifdef CONFIG_XFRM
> +       kfree(adapter->ipsec->ip_tbl);
> +       kfree(adapter->ipsec->rx_tbl);
> +       kfree(adapter->ipsec->tx_tbl);
> +       kfree(adapter->ipsec);
> +#endif /* CONFIG_XFRM */

It might be useful if you were to move this into a function of its
own. Also you should probably check for adapter->ipsec first,
otherwise you are going to cause NULL pointer dereference any time
adapter->ipsec isn't defined. because you are dereferencing it when
you go to free each of those tables.

>
>  #ifdef CONFIG_IXGBE_DCA
>         if (adapter->flags & IXGBE_FLAG_DCA_ENABLED) {
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 05/10] ixgbe: implement ipsec add and remove of offloaded SA
@ 2017-12-05 17:26     ` Alexander Duyck
  0 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 17:26 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> Add the functions for setting up and removing offloaded SAs (Security
> Associations) with the x540 hardware.  We set up the callback structure
> but we don't yet set the hardware feature bit to be sure the XFRM service
> won't actually try to use us for an offload yet.
>
> The software tables are made up to mimic the hardware tables to make it
> easier to track what's in the hardware, and the SA table index is used
> for the XFRM offload handle.  However, there is a hashing field in the
> Rx SA tracking that will be used to facilitate faster table searches in
> the Rx fast path.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 377 +++++++++++++++++++++++++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |   6 +
>  2 files changed, 383 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> index 38a1a16..7b01d92 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -26,6 +26,8 @@
>   ******************************************************************************/
>
>  #include "ixgbe.h"
> +#include <net/xfrm.h>
> +#include <crypto/aead.h>
>
>  /**
>   * ixgbe_ipsec_set_tx_sa - set the Tx SA registers
> @@ -128,6 +130,7 @@ static void ixgbe_ipsec_set_rx_ip(struct ixgbe_hw *hw, u16 idx, u32 addr[])
>   **/
>  void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
>  {
> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>         struct ixgbe_hw *hw = &adapter->hw;
>         u32 buf[4] = {0, 0, 0, 0};
>         u16 idx;
> @@ -139,9 +142,11 @@ void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
>         /* scrub the tables */
>         for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
>                 ixgbe_ipsec_set_tx_sa(hw, idx, buf, 0);
> +       ipsec->num_tx_sa = 0;
>
>         for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
>                 ixgbe_ipsec_set_rx_sa(hw, idx, 0, buf, 0, 0, 0);
> +       ipsec->num_rx_sa = 0;
>
>         for (idx = 0; idx < IXGBE_IPSEC_MAX_RX_IP_COUNT; idx++)
>                 ixgbe_ipsec_set_rx_ip(hw, idx, buf);
> @@ -287,11 +292,383 @@ static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
>  }
>
>  /**
> + * ixgbe_ipsec_find_empty_idx - find the first unused security parameter index
> + * @ipsec: pointer to ipsec struct
> + * @rxtable: true if we need to look in the Rx table
> + *
> + * Returns the first unused index in either the Rx or Tx SA table
> + **/
> +static int ixgbe_ipsec_find_empty_idx(struct ixgbe_ipsec *ipsec, bool rxtable)
> +{
> +       u32 i;
> +
> +       if (rxtable) {
> +               if (ipsec->num_rx_sa == IXGBE_IPSEC_MAX_SA_COUNT)
> +                       return -ENOSPC;
> +
> +               /* search rx sa table */
> +               for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
> +                       if (!ipsec->rx_tbl[i].used)
> +                               return i;
> +               }
> +       } else {
> +               if (ipsec->num_rx_sa == IXGBE_IPSEC_MAX_SA_COUNT)
> +                       return -ENOSPC;

Should this bi num_tx_sa?

> +
> +               /* search tx sa table */
> +               for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
> +                       if (!ipsec->tx_tbl[i].used)
> +                               return i;
> +               }
> +       }
> +
> +       return -ENOSPC;
> +}
> +
> +/**
> + * ixgbe_ipsec_parse_proto_keys - find the key and salt based on the protocol
> + * @xs: pointer to xfrm_state struct
> + * @mykey: pointer to key array to populate
> + * @mysalt: pointer to salt value to populate
> + *
> + * This copies the protocol keys and salt to our own data tables.  The
> + * 82599 family only supports the one algorithm.
> + **/
> +static int ixgbe_ipsec_parse_proto_keys(struct xfrm_state *xs,
> +                                       u32 *mykey, u32 *mysalt)
> +{
> +       struct net_device *dev = xs->xso.dev;
> +       unsigned char *key_data;
> +       char *alg_name = NULL;
> +       char *aes_gcm_name = "rfc4106(gcm(aes))";

aes_gcm_name should probably be a static const char array instead of a pointer.

> +       int key_len;
> +
> +       if (xs->aead) {
> +               key_data = &xs->aead->alg_key[0];
> +               key_len = xs->aead->alg_key_len;
> +               alg_name = xs->aead->alg_name;
> +       } else {
> +               netdev_err(dev, "Unsupported IPsec algorithm\n");
> +               return -EINVAL;
> +       }
> +
> +       if (strcmp(alg_name, aes_gcm_name)) {
> +               netdev_err(dev, "Unsupported IPsec algorithm - please use %s\n",
> +                          aes_gcm_name);
> +               return -EINVAL;
> +       }
> +
> +       /* 160 accounts for 16 byte key and 4 byte salt */
> +       if (key_len == 128) {
> +               netdev_info(dev, "IPsec hw offload parameters missing 32 bit salt value\n");
> +       } else if (key_len != 160) {
> +               netdev_err(dev, "IPsec hw offload only supports keys up to 128 bits with a 32 bit salt\n");
> +               return -EINVAL;
> +       }
> +
> +       /* The key bytes come down in a bigendian array of bytes, and
> +        * salt is always the last 4 bytes of the key array.
> +        * We don't need to do any byteswapping.
> +        */
> +       memcpy(mykey, key_data, 16);
> +       if (key_len == 160)
> +               *mysalt = ((u32 *)key_data)[4];
> +       else
> +               *mysalt = 0;

You could combine these key_len checks into a single if/else set.
Basically just do something like the following:

/* 160 accounts for 16 byte key and 4 byte salt */
if (key_len == 160) {
         *mysalt = ((u32 *)key_data)[4];
} else if (key_len != 128) {
        netdev_err(dev, "IPsec hw offload only supports keys up to 128
bits with a 32 bit salt\n");
        return -EINVAL;
} else {
        netdev_info(dev, "IPsec hw offload parameters missing 32 bit
salt value\n");
        *mysalt = 0;
}

 /* The key bytes come down in a bigendian array of bytes, and
  * salt is always the last 4 bytes of the key array.
  * We don't need to do any byteswapping.
  */
memcpy(mykey, key_data, 16);

> +
> +       return 0;
> +}
> +
> +/**
> + * ixgbe_ipsec_add_sa - program device with a security association
> + * @xs: pointer to transformer state struct
> + **/
> +static int ixgbe_ipsec_add_sa(struct xfrm_state *xs)
> +{
> +       struct net_device *dev = xs->xso.dev;
> +       struct ixgbe_adapter *adapter = netdev_priv(dev);
> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
> +       struct ixgbe_hw *hw = &adapter->hw;
> +       int checked, match, first;
> +       u16 sa_idx;
> +       int ret;
> +       int i;
> +
> +       if (xs->id.proto != IPPROTO_ESP && xs->id.proto != IPPROTO_AH) {
> +               netdev_err(dev, "Unsupported protocol 0x%04x for ipsec offload\n",
> +                          xs->id.proto);
> +               return -EINVAL;
> +       }
> +
> +       if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
> +               struct rx_sa rsa;
> +
> +               if (xs->calg) {
> +                       netdev_err(dev, "Compression offload not supported\n");
> +                       return -EINVAL;
> +               }
> +
> +               /* find the first unused index */
> +               ret = ixgbe_ipsec_find_empty_idx(ipsec, true);
> +               if (ret < 0) {
> +                       netdev_err(dev, "No space for SA in Rx table!\n");
> +                       return ret;
> +               }
> +               sa_idx = (u16)ret;
> +
> +               memset(&rsa, 0, sizeof(rsa));
> +               rsa.used = true;
> +               rsa.xs = xs;
> +
> +               if (rsa.xs->id.proto & IPPROTO_ESP)
> +                       rsa.decrypt = xs->ealg || xs->aead;
> +
> +               /* get the key and salt */
> +               ret = ixgbe_ipsec_parse_proto_keys(xs, rsa.key, &rsa.salt);
> +               if (ret) {
> +                       netdev_err(dev, "Failed to get key data for Rx SA table\n");
> +                       return ret;
> +               }
> +
> +               /* get ip for rx sa table */
> +               if (xs->xso.flags & XFRM_OFFLOAD_IPV6)
> +                       memcpy(rsa.ipaddr, &xs->id.daddr.a6, 16);
> +               else
> +                       memcpy(&rsa.ipaddr[3], &xs->id.daddr.a4, 4);
> +
> +               /* The HW does not have a 1:1 mapping from keys to IP addrs, so
> +                * check for a matching IP addr entry in the table.  If the addr
> +                * already exists, use it; else find an unused slot and add the
> +                * addr.  If one does not exist and there are no unused table
> +                * entries, fail the request.
> +                */
> +
> +               /* Find an existing match or first not used, and stop looking
> +                * after we've checked all we know we have.
> +                */
> +               checked = 0;
> +               match = -1;
> +               first = -1;
> +               for (i = 0;
> +                    i < IXGBE_IPSEC_MAX_RX_IP_COUNT &&
> +                    (checked < ipsec->num_rx_sa || first < 0);
> +                    i++) {
> +                       if (ipsec->ip_tbl[i].used) {
> +                               if (!memcmp(ipsec->ip_tbl[i].ipaddr,
> +                                           rsa.ipaddr, sizeof(rsa.ipaddr))) {
> +                                       match = i;
> +                                       break;
> +                               }
> +                               checked++;
> +                       } else if (first < 0) {
> +                               first = i;  /* track the first empty seen */
> +                       }
> +               }
> +
> +               if (ipsec->num_rx_sa == 0)
> +                       first = 0;
> +
> +               if (match >= 0) {
> +                       /* addrs are the same, we should use this one */
> +                       rsa.iptbl_ind = match;
> +                       ipsec->ip_tbl[match].ref_cnt++;
> +
> +               } else if (first >= 0) {
> +                       /* no matches, but here's an empty slot */
> +                       rsa.iptbl_ind = first;
> +
> +                       memcpy(ipsec->ip_tbl[first].ipaddr,
> +                              rsa.ipaddr, sizeof(rsa.ipaddr));
> +                       ipsec->ip_tbl[first].ref_cnt = 1;
> +                       ipsec->ip_tbl[first].used = true;
> +
> +                       ixgbe_ipsec_set_rx_ip(hw, rsa.iptbl_ind, rsa.ipaddr);
> +
> +               } else {
> +                       /* no match and no empty slot */
> +                       netdev_err(dev, "No space for SA in Rx IP SA table\n");
> +                       memset(&rsa, 0, sizeof(rsa));
> +                       return -ENOSPC;
> +               }
> +
> +               rsa.mode = IXGBE_RXMOD_VALID;
> +               if (rsa.xs->id.proto & IPPROTO_ESP)
> +                       rsa.mode |= IXGBE_RXMOD_PROTO_ESP;
> +               if (rsa.decrypt)
> +                       rsa.mode |= IXGBE_RXMOD_DECRYPT;
> +               if (rsa.xs->xso.flags & XFRM_OFFLOAD_IPV6)
> +                       rsa.mode |= IXGBE_RXMOD_IPV6;
> +
> +               /* the preparations worked, so save the info */
> +               memcpy(&ipsec->rx_tbl[sa_idx], &rsa, sizeof(rsa));
> +
> +               ixgbe_ipsec_set_rx_sa(hw, sa_idx, rsa.xs->id.spi, rsa.key,
> +                                     rsa.salt, rsa.mode, rsa.iptbl_ind);
> +               xs->xso.offload_handle = sa_idx + IXGBE_IPSEC_BASE_RX_INDEX;
> +
> +               ipsec->num_rx_sa++;
> +
> +               /* hash the new entry for faster search in Rx path */
> +               hash_add_rcu(ipsec->rx_sa_list, &ipsec->rx_tbl[sa_idx].hlist,
> +                            rsa.xs->id.spi);
> +       } else {
> +               struct tx_sa tsa;
> +
> +               /* find the first unused index */
> +               ret = ixgbe_ipsec_find_empty_idx(ipsec, false);
> +               if (ret < 0) {
> +                       netdev_err(dev, "No space for SA in Tx table\n");
> +                       return ret;
> +               }
> +               sa_idx = (u16)ret;
> +
> +               memset(&tsa, 0, sizeof(tsa));
> +               tsa.used = true;
> +               tsa.xs = xs;
> +
> +               if (xs->id.proto & IPPROTO_ESP)
> +                       tsa.encrypt = xs->ealg || xs->aead;
> +
> +               ret = ixgbe_ipsec_parse_proto_keys(xs, tsa.key, &tsa.salt);
> +               if (ret) {
> +                       netdev_err(dev, "Failed to get key data for Tx SA table\n");
> +                       memset(&tsa, 0, sizeof(tsa));
> +                       return ret;
> +               }
> +
> +               /* the preparations worked, so save the info */
> +               memcpy(&ipsec->tx_tbl[sa_idx], &tsa, sizeof(tsa));
> +
> +               ixgbe_ipsec_set_tx_sa(hw, sa_idx, tsa.key, tsa.salt);
> +
> +               xs->xso.offload_handle = sa_idx + IXGBE_IPSEC_BASE_TX_INDEX;
> +
> +               ipsec->num_tx_sa++;
> +       }
> +
> +       /* enable the engine if not already warmed up */
> +       if (!(adapter->flags2 & IXGBE_FLAG2_IPSEC_ENABLED)) {
> +               ixgbe_ipsec_start_engine(adapter);
> +               adapter->flags2 |= IXGBE_FLAG2_IPSEC_ENABLED;
> +       }
> +
> +       return 0;
> +}
> +
> +/**
> + * ixgbe_ipsec_del_sa - clear out this specific SA
> + * @xs: pointer to transformer state struct
> + **/
> +static void ixgbe_ipsec_del_sa(struct xfrm_state *xs)
> +{
> +       struct net_device *dev = xs->xso.dev;
> +       struct ixgbe_adapter *adapter = netdev_priv(dev);
> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
> +       struct ixgbe_hw *hw = &adapter->hw;
> +       u32 zerobuf[4] = {0, 0, 0, 0};
> +       u16 sa_idx;
> +
> +       if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
> +               struct rx_sa *rsa;
> +               u8 ipi;
> +
> +               sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_RX_INDEX;
> +               rsa = &ipsec->rx_tbl[sa_idx];
> +
> +               if (!rsa->used) {
> +                       netdev_err(dev, "Invalid Rx SA selected sa_idx=%d offload_handle=%lu\n",
> +                                  sa_idx, xs->xso.offload_handle);
> +                       return;
> +               }
> +
> +               ixgbe_ipsec_set_rx_sa(hw, sa_idx, 0, zerobuf, 0, 0, 0);
> +               hash_del_rcu(&rsa->hlist);
> +
> +               /* if the IP table entry is referenced by only this SA,
> +                * i.e. ref_cnt is only 1, clear the IP table entry as well
> +                */
> +               ipi = rsa->iptbl_ind;
> +               if (ipsec->ip_tbl[ipi].ref_cnt > 0) {
> +                       ipsec->ip_tbl[ipi].ref_cnt--;
> +
> +                       if (!ipsec->ip_tbl[ipi].ref_cnt) {
> +                               memset(&ipsec->ip_tbl[ipi], 0,
> +                                      sizeof(struct rx_ip_sa));
> +                               ixgbe_ipsec_set_rx_ip(hw, ipi, zerobuf);
> +                       }
> +               }
> +
> +               memset(rsa, 0, sizeof(struct rx_sa));
> +               ipsec->num_rx_sa--;
> +       } else {
> +               sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX;
> +
> +               if (!ipsec->tx_tbl[sa_idx].used) {
> +                       netdev_err(dev, "Invalid Tx SA selected sa_idx=%d offload_handle=%lu\n",
> +                                  sa_idx, xs->xso.offload_handle);
> +                       return;
> +               }
> +
> +               ixgbe_ipsec_set_tx_sa(hw, sa_idx, zerobuf, 0);
> +               memset(&ipsec->tx_tbl[sa_idx], 0, sizeof(struct tx_sa));
> +               ipsec->num_tx_sa--;
> +       }
> +
> +       /* if there are no SAs left, stop the engine to save energy */
> +       if (ipsec->num_rx_sa == 0 && ipsec->num_tx_sa == 0) {
> +               adapter->flags2 &= ~IXGBE_FLAG2_IPSEC_ENABLED;
> +               ixgbe_ipsec_stop_engine(adapter);
> +       }
> +}
> +
> +static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
> +       .xdo_dev_state_add = ixgbe_ipsec_add_sa,
> +       .xdo_dev_state_delete = ixgbe_ipsec_del_sa,
> +};
> +
> +/**
>   * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
>   * @adapter: board private structure
>   **/
>  void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
>  {
> +       struct ixgbe_ipsec *ipsec;
> +       size_t size;
> +
> +       ipsec = kzalloc(sizeof(*ipsec), GFP_KERNEL);
> +       if (!ipsec)
> +               goto err;

I would say just add another label to skip over the if statement you
added below.

> +       hash_init(ipsec->rx_sa_list);
> +
> +       size = sizeof(struct rx_sa) * IXGBE_IPSEC_MAX_SA_COUNT;
> +       ipsec->rx_tbl = kzalloc(size, GFP_KERNEL);
> +       if (!ipsec->rx_tbl)
> +               goto err;
> +
> +       size = sizeof(struct tx_sa) * IXGBE_IPSEC_MAX_SA_COUNT;
> +       ipsec->tx_tbl = kzalloc(size, GFP_KERNEL);
> +       if (!ipsec->tx_tbl)
> +               goto err;
> +
> +       size = sizeof(struct rx_ip_sa) * IXGBE_IPSEC_MAX_RX_IP_COUNT;
> +       ipsec->ip_tbl = kzalloc(size, GFP_KERNEL);
> +       if (!ipsec->ip_tbl)
> +               goto err;

Do all these tables need to be allocated separately? I'm just
wondering if we can get away with doing something like what we did
with the ixgbe_q_vector structure where you just allocate this as one
physical block of memory and just split it up into multiple chunks
with a separate pointer to each chunk. Doing that would cut down on
the exception handling needed since it would be a single allocation
failure you would have to deal with.

> +       ipsec->num_rx_sa = 0;
> +       ipsec->num_tx_sa = 0;
> +
> +       adapter->ipsec = ipsec;
>         ixgbe_ipsec_clear_hw_tables(adapter);
>         ixgbe_ipsec_stop_engine(adapter);
> +
> +       return;
> +err:
> +       if (ipsec) {
> +               kfree(ipsec->ip_tbl);
> +               kfree(ipsec->rx_tbl);
> +               kfree(ipsec->tx_tbl);
> +               kfree(adapter->ipsec);
> +       }
> +       netdev_err(adapter->netdev, "Unable to allocate memory for SA tables");
>  }
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 51fb3cf..01fd89b 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -10542,6 +10542,12 @@ static void ixgbe_remove(struct pci_dev *pdev)
>         set_bit(__IXGBE_REMOVING, &adapter->state);
>         cancel_work_sync(&adapter->service_task);
>
> +#ifdef CONFIG_XFRM
> +       kfree(adapter->ipsec->ip_tbl);
> +       kfree(adapter->ipsec->rx_tbl);
> +       kfree(adapter->ipsec->tx_tbl);
> +       kfree(adapter->ipsec);
> +#endif /* CONFIG_XFRM */

It might be useful if you were to move this into a function of its
own. Also you should probably check for adapter->ipsec first,
otherwise you are going to cause NULL pointer dereference any time
adapter->ipsec isn't defined. because you are dereferencing it when
you go to free each of those tables.

>
>  #ifdef CONFIG_IXGBE_DCA
>         if (adapter->flags & IXGBE_FLAG_DCA_ENABLED) {
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan at osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 06/10] ixgbe: restore offloaded SAs after a reset
  2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05 17:30     ` Alexander Duyck
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 17:30 UTC (permalink / raw)
  To: Shannon Nelson
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> On a chip reset most of the table contents are lost, so must be
> restored.  This scans the driver's ipsec tables and restores both
> the filled and empty table slots to their pre-reset values.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  2 +
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 53 ++++++++++++++++++++++++++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  1 +
>  3 files changed, 56 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index 9487750..7e8bca7 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -1009,7 +1009,9 @@ s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
>                        u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
>  #ifdef CONFIG_XFRM_OFFLOAD
>  void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
> +void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>  #else
>  static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
> +static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
>  #endif /* CONFIG_XFRM_OFFLOAD */
>  #endif /* _IXGBE_H_ */
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> index 7b01d92..b93ee7f 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -292,6 +292,59 @@ static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
>  }
>
>  /**
> + * ixgbe_ipsec_restore - restore the ipsec HW settings after a reset
> + * @adapter: board private structure
> + **/
> +void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter)
> +{
> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
> +       struct ixgbe_hw *hw = &adapter->hw;
> +       u32 zbuf[4] = {0, 0, 0, 0};

zbuf should be a static const.

> +       int i;
> +
> +       if (!(adapter->flags2 & IXGBE_FLAG2_IPSEC_ENABLED))
> +               return;
> +
> +       /* clean up the engine settings */
> +       ixgbe_ipsec_stop_engine(adapter);
> +
> +       /* start the engine */
> +       ixgbe_ipsec_start_engine(adapter);
> +
> +       /* reload the IP addrs */
> +       for (i = 0; i < IXGBE_IPSEC_MAX_RX_IP_COUNT; i++) {
> +               struct rx_ip_sa *ipsa = &ipsec->ip_tbl[i];
> +
> +               if (ipsa->used)
> +                       ixgbe_ipsec_set_rx_ip(hw, i, ipsa->ipaddr);
> +               else
> +                       ixgbe_ipsec_set_rx_ip(hw, i, zbuf);

If we are doing a restore do we actually need to write the zero
values? If we did a reset I thought you had a function that was going
through and zeroing everything out. If so this now becomes redundant.

> +       }
> +
> +       /* reload the Rx keys */
> +       for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
> +               struct rx_sa *rsa = &ipsec->rx_tbl[i];
> +
> +               if (rsa->used)
> +                       ixgbe_ipsec_set_rx_sa(hw, i, rsa->xs->id.spi,
> +                                             rsa->key, rsa->salt,
> +                                             rsa->mode, rsa->iptbl_ind);
> +               else
> +                       ixgbe_ipsec_set_rx_sa(hw, i, 0, zbuf, 0, 0, 0);

same here

> +       }
> +
> +       /* reload the Tx keys */
> +       for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
> +               struct tx_sa *tsa = &ipsec->tx_tbl[i];
> +
> +               if (tsa->used)
> +                       ixgbe_ipsec_set_tx_sa(hw, i, tsa->key, tsa->salt);
> +               else
> +                       ixgbe_ipsec_set_tx_sa(hw, i, zbuf, 0);

and here

> +       }
> +}
> +
> +/**
>   * ixgbe_ipsec_find_empty_idx - find the first unused security parameter index
>   * @ipsec: pointer to ipsec struct
>   * @rxtable: true if we need to look in the Rx table
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 01fd89b..6eabf92 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -5347,6 +5347,7 @@ static void ixgbe_configure(struct ixgbe_adapter *adapter)
>
>         ixgbe_set_rx_mode(adapter->netdev);
>         ixgbe_restore_vlan(adapter);
> +       ixgbe_ipsec_restore(adapter);
>
>         switch (hw->mac.type) {
>         case ixgbe_mac_82599EB:
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 06/10] ixgbe: restore offloaded SAs after a reset
@ 2017-12-05 17:30     ` Alexander Duyck
  0 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 17:30 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> On a chip reset most of the table contents are lost, so must be
> restored.  This scans the driver's ipsec tables and restores both
> the filled and empty table slots to their pre-reset values.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  2 +
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 53 ++++++++++++++++++++++++++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  1 +
>  3 files changed, 56 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index 9487750..7e8bca7 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -1009,7 +1009,9 @@ s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
>                        u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
>  #ifdef CONFIG_XFRM_OFFLOAD
>  void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
> +void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>  #else
>  static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
> +static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
>  #endif /* CONFIG_XFRM_OFFLOAD */
>  #endif /* _IXGBE_H_ */
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> index 7b01d92..b93ee7f 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -292,6 +292,59 @@ static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
>  }
>
>  /**
> + * ixgbe_ipsec_restore - restore the ipsec HW settings after a reset
> + * @adapter: board private structure
> + **/
> +void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter)
> +{
> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
> +       struct ixgbe_hw *hw = &adapter->hw;
> +       u32 zbuf[4] = {0, 0, 0, 0};

zbuf should be a static const.

> +       int i;
> +
> +       if (!(adapter->flags2 & IXGBE_FLAG2_IPSEC_ENABLED))
> +               return;
> +
> +       /* clean up the engine settings */
> +       ixgbe_ipsec_stop_engine(adapter);
> +
> +       /* start the engine */
> +       ixgbe_ipsec_start_engine(adapter);
> +
> +       /* reload the IP addrs */
> +       for (i = 0; i < IXGBE_IPSEC_MAX_RX_IP_COUNT; i++) {
> +               struct rx_ip_sa *ipsa = &ipsec->ip_tbl[i];
> +
> +               if (ipsa->used)
> +                       ixgbe_ipsec_set_rx_ip(hw, i, ipsa->ipaddr);
> +               else
> +                       ixgbe_ipsec_set_rx_ip(hw, i, zbuf);

If we are doing a restore do we actually need to write the zero
values? If we did a reset I thought you had a function that was going
through and zeroing everything out. If so this now becomes redundant.

> +       }
> +
> +       /* reload the Rx keys */
> +       for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
> +               struct rx_sa *rsa = &ipsec->rx_tbl[i];
> +
> +               if (rsa->used)
> +                       ixgbe_ipsec_set_rx_sa(hw, i, rsa->xs->id.spi,
> +                                             rsa->key, rsa->salt,
> +                                             rsa->mode, rsa->iptbl_ind);
> +               else
> +                       ixgbe_ipsec_set_rx_sa(hw, i, 0, zbuf, 0, 0, 0);

same here

> +       }
> +
> +       /* reload the Tx keys */
> +       for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
> +               struct tx_sa *tsa = &ipsec->tx_tbl[i];
> +
> +               if (tsa->used)
> +                       ixgbe_ipsec_set_tx_sa(hw, i, tsa->key, tsa->salt);
> +               else
> +                       ixgbe_ipsec_set_tx_sa(hw, i, zbuf, 0);

and here

> +       }
> +}
> +
> +/**
>   * ixgbe_ipsec_find_empty_idx - find the first unused security parameter index
>   * @ipsec: pointer to ipsec struct
>   * @rxtable: true if we need to look in the Rx table
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 01fd89b..6eabf92 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -5347,6 +5347,7 @@ static void ixgbe_configure(struct ixgbe_adapter *adapter)
>
>         ixgbe_set_rx_mode(adapter->netdev);
>         ixgbe_restore_vlan(adapter);
> +       ixgbe_ipsec_restore(adapter);
>
>         switch (hw->mac.type) {
>         case ixgbe_mac_82599EB:
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan at osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 07/10] ixgbe: process the Rx ipsec offload
  2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05 17:40     ` Alexander Duyck
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 17:40 UTC (permalink / raw)
  To: Shannon Nelson
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> If the chip sees and decrypts an ipsec offload, set up the skb
> sp pointer with the ralated SA info.  Since the chip is rude
> enough to keep to itself the table index it used for the
> decryption, we have to do our own table lookup, using the
> hash for speed.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  6 ++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 89 ++++++++++++++++++++++++++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  3 +
>  3 files changed, 98 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index 7e8bca7..77f07dc 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -1009,9 +1009,15 @@ s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
>                        u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
>  #ifdef CONFIG_XFRM_OFFLOAD
>  void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
> +void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
> +                   union ixgbe_adv_rx_desc *rx_desc,
> +                   struct sk_buff *skb);
>  void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>  #else
>  static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
> +static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
> +                                 union ixgbe_adv_rx_desc *rx_desc,
> +                                 struct sk_buff *skb) { };
>  static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
>  #endif /* CONFIG_XFRM_OFFLOAD */
>  #endif /* _IXGBE_H_ */
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> index b93ee7f..fd06d9b 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -379,6 +379,35 @@ static int ixgbe_ipsec_find_empty_idx(struct ixgbe_ipsec *ipsec, bool rxtable)
>  }
>
>  /**
> + * ixgbe_ipsec_find_rx_state - find the state that matches
> + * @ipsec: pointer to ipsec struct
> + * @daddr: inbound address to match
> + * @proto: protocol to match
> + * @spi: SPI to match
> + *
> + * Returns a pointer to the matching SA state information
> + **/
> +static struct xfrm_state *ixgbe_ipsec_find_rx_state(struct ixgbe_ipsec *ipsec,
> +                                                   __be32 daddr, u8 proto,
> +                                                   __be32 spi)
> +{
> +       struct rx_sa *rsa;
> +       struct xfrm_state *ret = NULL;
> +
> +       rcu_read_lock();
> +       hash_for_each_possible_rcu(ipsec->rx_sa_list, rsa, hlist, spi)
> +               if (spi == rsa->xs->id.spi &&
> +                   daddr == rsa->xs->id.daddr.a4 &&
> +                   proto == rsa->xs->id.proto) {
> +                       ret = rsa->xs;
> +                       xfrm_state_hold(ret);
> +                       break;
> +               }
> +       rcu_read_unlock();
> +       return ret;
> +}
> +

You need to choose a bucket, not just walk through all buckets.
Otherwise you might as well have just used a linked list. You might
look at using something like jhash_3words to generate a hash which you
then use to choose the bucket.

> +/**
>   * ixgbe_ipsec_parse_proto_keys - find the key and salt based on the protocol
>   * @xs: pointer to xfrm_state struct
>   * @mykey: pointer to key array to populate
> @@ -680,6 +709,66 @@ static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
>  };
>
>  /**
> + * ixgbe_ipsec_rx - decode ipsec bits from Rx descriptor
> + * @rx_ring: receiving ring
> + * @rx_desc: receive data descriptor
> + * @skb: current data packet
> + *
> + * Determine if there was an ipsec encapsulation noticed, and if so set up
> + * the resulting status for later in the receive stack.
> + **/
> +void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
> +                   union ixgbe_adv_rx_desc *rx_desc,
> +                   struct sk_buff *skb)
> +{
> +       struct ixgbe_adapter *adapter = netdev_priv(rx_ring->netdev);
> +       u16 pkt_info = le16_to_cpu(rx_desc->wb.lower.lo_dword.hs_rss.pkt_info);
> +       u16 ipsec_pkt_types = IXGBE_RXDADV_PKTTYPE_IPSEC_AH |
> +                               IXGBE_RXDADV_PKTTYPE_IPSEC_ESP;
> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
> +       struct xfrm_offload *xo = NULL;
> +       struct xfrm_state *xs = NULL;
> +       struct iphdr *iph;
> +       u8 *c_hdr;
> +       __be32 spi;
> +       u8 proto;
> +
> +       /* we can assume no vlan header in the way, b/c the
> +        * hw won't recognize the IPsec packet and anyway the
> +        * currently vlan device doesn't support xfrm offload.
> +        */
> +       /* TODO: not supporting IPv6 yet */
> +       iph = (struct iphdr *)(skb->data + ETH_HLEN);
> +       c_hdr = (u8 *)iph + iph->ihl * 4;
> +       switch (pkt_info & ipsec_pkt_types) {
> +       case IXGBE_RXDADV_PKTTYPE_IPSEC_AH:
> +               spi = ((struct ip_auth_hdr *)c_hdr)->spi;
> +               proto = IPPROTO_AH;
> +               break;
> +       case IXGBE_RXDADV_PKTTYPE_IPSEC_ESP:
> +               spi = ((struct ip_esp_hdr *)c_hdr)->spi;
> +               proto = IPPROTO_ESP;
> +               break;
> +       default:
> +               return;
> +       }
> +
> +       xs = ixgbe_ipsec_find_rx_state(ipsec, iph->daddr, proto, spi);
> +       if (unlikely(!xs))
> +               return;
> +
> +       skb->sp = secpath_dup(skb->sp);
> +       if (unlikely(!skb->sp))
> +               return;
> +
> +       skb->sp->xvec[skb->sp->len++] = xs;
> +       skb->sp->olen++;
> +       xo = xfrm_offload(skb);
> +       xo->flags = CRYPTO_DONE;
> +       xo->status = CRYPTO_SUCCESS;
> +}
> +
> +/**
>   * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
>   * @adapter: board private structure
>   **/
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 6eabf92..60f9f2d 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -1755,6 +1755,9 @@ static void ixgbe_process_skb_fields(struct ixgbe_ring *rx_ring,
>
>         skb_record_rx_queue(skb, rx_ring->queue_index);
>
> +       if (ixgbe_test_staterr(rx_desc, IXGBE_RXDADV_STAT_SECP))
> +               ixgbe_ipsec_rx(rx_ring, rx_desc, skb);
> +
>         skb->protocol = eth_type_trans(skb, dev);
>  }
>
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 07/10] ixgbe: process the Rx ipsec offload
@ 2017-12-05 17:40     ` Alexander Duyck
  0 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 17:40 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> If the chip sees and decrypts an ipsec offload, set up the skb
> sp pointer with the ralated SA info.  Since the chip is rude
> enough to keep to itself the table index it used for the
> decryption, we have to do our own table lookup, using the
> hash for speed.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  6 ++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 89 ++++++++++++++++++++++++++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  3 +
>  3 files changed, 98 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index 7e8bca7..77f07dc 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -1009,9 +1009,15 @@ s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
>                        u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
>  #ifdef CONFIG_XFRM_OFFLOAD
>  void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
> +void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
> +                   union ixgbe_adv_rx_desc *rx_desc,
> +                   struct sk_buff *skb);
>  void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>  #else
>  static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
> +static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
> +                                 union ixgbe_adv_rx_desc *rx_desc,
> +                                 struct sk_buff *skb) { };
>  static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
>  #endif /* CONFIG_XFRM_OFFLOAD */
>  #endif /* _IXGBE_H_ */
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> index b93ee7f..fd06d9b 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -379,6 +379,35 @@ static int ixgbe_ipsec_find_empty_idx(struct ixgbe_ipsec *ipsec, bool rxtable)
>  }
>
>  /**
> + * ixgbe_ipsec_find_rx_state - find the state that matches
> + * @ipsec: pointer to ipsec struct
> + * @daddr: inbound address to match
> + * @proto: protocol to match
> + * @spi: SPI to match
> + *
> + * Returns a pointer to the matching SA state information
> + **/
> +static struct xfrm_state *ixgbe_ipsec_find_rx_state(struct ixgbe_ipsec *ipsec,
> +                                                   __be32 daddr, u8 proto,
> +                                                   __be32 spi)
> +{
> +       struct rx_sa *rsa;
> +       struct xfrm_state *ret = NULL;
> +
> +       rcu_read_lock();
> +       hash_for_each_possible_rcu(ipsec->rx_sa_list, rsa, hlist, spi)
> +               if (spi == rsa->xs->id.spi &&
> +                   daddr == rsa->xs->id.daddr.a4 &&
> +                   proto == rsa->xs->id.proto) {
> +                       ret = rsa->xs;
> +                       xfrm_state_hold(ret);
> +                       break;
> +               }
> +       rcu_read_unlock();
> +       return ret;
> +}
> +

You need to choose a bucket, not just walk through all buckets.
Otherwise you might as well have just used a linked list. You might
look at using something like jhash_3words to generate a hash which you
then use to choose the bucket.

> +/**
>   * ixgbe_ipsec_parse_proto_keys - find the key and salt based on the protocol
>   * @xs: pointer to xfrm_state struct
>   * @mykey: pointer to key array to populate
> @@ -680,6 +709,66 @@ static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
>  };
>
>  /**
> + * ixgbe_ipsec_rx - decode ipsec bits from Rx descriptor
> + * @rx_ring: receiving ring
> + * @rx_desc: receive data descriptor
> + * @skb: current data packet
> + *
> + * Determine if there was an ipsec encapsulation noticed, and if so set up
> + * the resulting status for later in the receive stack.
> + **/
> +void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
> +                   union ixgbe_adv_rx_desc *rx_desc,
> +                   struct sk_buff *skb)
> +{
> +       struct ixgbe_adapter *adapter = netdev_priv(rx_ring->netdev);
> +       u16 pkt_info = le16_to_cpu(rx_desc->wb.lower.lo_dword.hs_rss.pkt_info);
> +       u16 ipsec_pkt_types = IXGBE_RXDADV_PKTTYPE_IPSEC_AH |
> +                               IXGBE_RXDADV_PKTTYPE_IPSEC_ESP;
> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
> +       struct xfrm_offload *xo = NULL;
> +       struct xfrm_state *xs = NULL;
> +       struct iphdr *iph;
> +       u8 *c_hdr;
> +       __be32 spi;
> +       u8 proto;
> +
> +       /* we can assume no vlan header in the way, b/c the
> +        * hw won't recognize the IPsec packet and anyway the
> +        * currently vlan device doesn't support xfrm offload.
> +        */
> +       /* TODO: not supporting IPv6 yet */
> +       iph = (struct iphdr *)(skb->data + ETH_HLEN);
> +       c_hdr = (u8 *)iph + iph->ihl * 4;
> +       switch (pkt_info & ipsec_pkt_types) {
> +       case IXGBE_RXDADV_PKTTYPE_IPSEC_AH:
> +               spi = ((struct ip_auth_hdr *)c_hdr)->spi;
> +               proto = IPPROTO_AH;
> +               break;
> +       case IXGBE_RXDADV_PKTTYPE_IPSEC_ESP:
> +               spi = ((struct ip_esp_hdr *)c_hdr)->spi;
> +               proto = IPPROTO_ESP;
> +               break;
> +       default:
> +               return;
> +       }
> +
> +       xs = ixgbe_ipsec_find_rx_state(ipsec, iph->daddr, proto, spi);
> +       if (unlikely(!xs))
> +               return;
> +
> +       skb->sp = secpath_dup(skb->sp);
> +       if (unlikely(!skb->sp))
> +               return;
> +
> +       skb->sp->xvec[skb->sp->len++] = xs;
> +       skb->sp->olen++;
> +       xo = xfrm_offload(skb);
> +       xo->flags = CRYPTO_DONE;
> +       xo->status = CRYPTO_SUCCESS;
> +}
> +
> +/**
>   * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
>   * @adapter: board private structure
>   **/
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 6eabf92..60f9f2d 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -1755,6 +1755,9 @@ static void ixgbe_process_skb_fields(struct ixgbe_ring *rx_ring,
>
>         skb_record_rx_queue(skb, rx_ring->queue_index);
>
> +       if (ixgbe_test_staterr(rx_desc, IXGBE_RXDADV_STAT_SECP))
> +               ixgbe_ipsec_rx(rx_ring, rx_desc, skb);
> +
>         skb->protocol = eth_type_trans(skb, dev);
>  }
>
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan at osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 08/10] ixgbe: process the Tx ipsec offload
  2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05 18:13     ` Alexander Duyck
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 18:13 UTC (permalink / raw)
  To: Shannon Nelson
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> If the skb has a security association referenced in the skb, then
> set up the Tx descriptor with the ipsec offload bits.  While we're
> here, we fix an oddly named field in the context descriptor struct.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe.h       | 10 +++-
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 77 ++++++++++++++++++++++++++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c   |  4 +-
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 38 ++++++++++---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_type.h  |  2 +-
>  5 files changed, 118 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index 77f07dc..68097fe 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -171,10 +171,11 @@ enum ixgbe_tx_flags {
>         IXGBE_TX_FLAGS_CC       = 0x08,
>         IXGBE_TX_FLAGS_IPV4     = 0x10,
>         IXGBE_TX_FLAGS_CSUM     = 0x20,
> +       IXGBE_TX_FLAGS_IPSEC    = 0x40,
>
>         /* software defined flags */
> -       IXGBE_TX_FLAGS_SW_VLAN  = 0x40,
> -       IXGBE_TX_FLAGS_FCOE     = 0x80,
> +       IXGBE_TX_FLAGS_SW_VLAN  = 0x80,
> +       IXGBE_TX_FLAGS_FCOE     = 0x100,
>  };
>
>  /* VLAN info */
> @@ -1012,12 +1013,17 @@ void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
>  void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>                     union ixgbe_adv_rx_desc *rx_desc,
>                     struct sk_buff *skb);
> +int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
> +                  __be16 protocol, struct ixgbe_ipsec_tx_data *itd);
>  void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>  #else
>  static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
>  static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>                                   union ixgbe_adv_rx_desc *rx_desc,
>                                   struct sk_buff *skb) { };
> +static inline int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring,
> +                                struct sk_buff *skb, __be16 protocol,
> +                                struct ixgbe_ipsec_tx_data *itd) { return 0; };
>  static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
>  #endif /* CONFIG_XFRM_OFFLOAD */
>  #endif /* _IXGBE_H_ */
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> index fd06d9b..2a0dd7a 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -703,12 +703,89 @@ static void ixgbe_ipsec_del_sa(struct xfrm_state *xs)
>         }
>  }
>
> +/**
> + * ixgbe_ipsec_offload_ok - can this packet use the xfrm hw offload
> + * @skb: current data packet
> + * @xs: pointer to transformer state struct
> + **/
> +static bool ixgbe_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
> +{
> +       if (xs->props.family == AF_INET) {
> +               /* Offload with IPv4 options is not supported yet */
> +               if (ip_hdr(skb)->ihl > 5)

I would make this ihl != 5 instead of "> 5" since smaller values would
be invalid as well.

> +                       return false;
> +       } else {
> +               /* Offload with IPv6 extension headers is not support yet */
> +               if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
> +                       return false;
> +       }
> +
> +       return true;
> +}
> +
>  static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
>         .xdo_dev_state_add = ixgbe_ipsec_add_sa,
>         .xdo_dev_state_delete = ixgbe_ipsec_del_sa,
> +       .xdo_dev_offload_ok = ixgbe_ipsec_offload_ok,
>  };
>
>  /**
> + * ixgbe_ipsec_tx - setup Tx flags for ipsec offload
> + * @tx_ring: outgoing context
> + * @skb: current data packet
> + * @protocol: network protocol
> + * @itd: ipsec Tx data for later use in building context descriptor
> + **/
> +int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
> +                  __be16 protocol, struct ixgbe_ipsec_tx_data *itd)
> +{
> +       struct ixgbe_adapter *adapter = netdev_priv(tx_ring->netdev);
> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
> +       struct xfrm_state *xs;
> +       struct tx_sa *tsa;
> +
> +       if (!skb->sp->len) {
> +               netdev_err(tx_ring->netdev, "%s: no xfrm state len = %d\n",
> +                          __func__, skb->sp->len);
> +               return 0;
> +       }
> +
> +       xs = xfrm_input_state(skb);
> +       if (!xs) {
> +               netdev_err(tx_ring->netdev, "%s: no xfrm_input_state() xs = %p\n",
> +                          __func__, xs);
> +               return 0;
> +       }
> +
> +       itd->sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX;
> +       if (itd->sa_idx > IXGBE_IPSEC_MAX_SA_COUNT) {
> +               netdev_err(tx_ring->netdev, "%s: bad sa_idx=%d handle=%lu\n",
> +                          __func__, itd->sa_idx, xs->xso.offload_handle);
> +               return 0;
> +       }
> +
> +       tsa = &ipsec->tx_tbl[itd->sa_idx];
> +       if (!tsa->used) {
> +               netdev_err(tx_ring->netdev, "%s: unused sa_idx=%d\n",
> +                          __func__, itd->sa_idx);
> +               return 0;
> +       }
> +
> +       itd->flags = 0;
> +       if (xs->id.proto == IPPROTO_ESP) {
> +               itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_TYPE_ESP |
> +                             IXGBE_ADVTXD_TUCMD_L4T_TCP;

Why is the TCP value being set here? This doesn't seem correct either.
This implies TCP a TCP offload. It seems like this should only be
setting ESP.

> +               if (protocol == htons(ETH_P_IP))
> +                       itd->flags |= IXGBE_ADVTXD_TUCMD_IPV4;

Does the IPsec offload need to know if the frame is v4 or v6? I'm just
wondering if it does or not. If not then this probably isn't needed.
One thought on this line is you might look at moving it into
ixgbe_tx_csum. If setting the bit is harmless without setting IXSM we
might look at moving it into the end of ixgbe_tx_csum and just make it
compare against first->protocol there.

> +               itd->trailer_len = xs->props.trailer_len;
> +       }
> +       if (tsa->encrypt)
> +               itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN;
> +
> +       return 1;
> +}
> +
> +/**
>   * ixgbe_ipsec_rx - decode ipsec bits from Rx descriptor
>   * @rx_ring: receiving ring
>   * @rx_desc: receive data descriptor
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> index f1bfae0..d7875b3 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> @@ -1261,7 +1261,7 @@ void ixgbe_clear_interrupt_scheme(struct ixgbe_adapter *adapter)
>  }
>
>  void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring, u32 vlan_macip_lens,
> -                      u32 fcoe_sof_eof, u32 type_tucmd, u32 mss_l4len_idx)
> +                      u32 fceof_saidx, u32 type_tucmd, u32 mss_l4len_idx)
>  {
>         struct ixgbe_adv_tx_context_desc *context_desc;
>         u16 i = tx_ring->next_to_use;
> @@ -1275,7 +1275,7 @@ void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring, u32 vlan_macip_lens,
>         type_tucmd |= IXGBE_TXD_CMD_DEXT | IXGBE_ADVTXD_DTYP_CTXT;
>
>         context_desc->vlan_macip_lens   = cpu_to_le32(vlan_macip_lens);
> -       context_desc->seqnum_seed       = cpu_to_le32(fcoe_sof_eof);
> +       context_desc->fceof_saidx       = cpu_to_le32(fceof_saidx);
>         context_desc->type_tucmd_mlhl   = cpu_to_le32(type_tucmd);
>         context_desc->mss_l4len_idx     = cpu_to_le32(mss_l4len_idx);
>  }
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 60f9f2d..c857594 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -7659,9 +7659,10 @@ static void ixgbe_service_task(struct work_struct *work)
>
>  static int ixgbe_tso(struct ixgbe_ring *tx_ring,
>                      struct ixgbe_tx_buffer *first,
> -                    u8 *hdr_len)
> +                    u8 *hdr_len,
> +                    struct ixgbe_ipsec_tx_data *itd)
>  {
> -       u32 vlan_macip_lens, type_tucmd, mss_l4len_idx;
> +       u32 vlan_macip_lens, type_tucmd, mss_l4len_idx, fceof_saidx = 0;
>         struct sk_buff *skb = first->skb;
>         union {
>                 struct iphdr *v4;
> @@ -7740,7 +7741,12 @@ static int ixgbe_tso(struct ixgbe_ring *tx_ring,
>         vlan_macip_lens |= (ip.hdr - skb->data) << IXGBE_ADVTXD_MACLEN_SHIFT;
>         vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
>
> -       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd,
> +       if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
> +               fceof_saidx |= itd->sa_idx;
> +               type_tucmd |= itd->flags | itd->trailer_len;
> +       }
> +
> +       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx, type_tucmd,
>                           mss_l4len_idx);
>
>         return 1;
> @@ -7756,10 +7762,12 @@ static inline bool ixgbe_ipv6_csum_is_sctp(struct sk_buff *skb)
>  }
>
>  static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
> -                         struct ixgbe_tx_buffer *first)
> +                         struct ixgbe_tx_buffer *first,
> +                         struct ixgbe_ipsec_tx_data *itd)
>  {
>         struct sk_buff *skb = first->skb;
>         u32 vlan_macip_lens = 0;
> +       u32 fceof_saidx = 0;
>         u32 type_tucmd = 0;
>
>         if (skb->ip_summed != CHECKSUM_PARTIAL) {
> @@ -7800,7 +7808,12 @@ static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
>         vlan_macip_lens |= skb_network_offset(skb) << IXGBE_ADVTXD_MACLEN_SHIFT;
>         vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
>
> -       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd, 0);
> +       if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
> +               fceof_saidx |= itd->sa_idx;
> +               type_tucmd |= itd->flags | itd->trailer_len;
> +       }
> +
> +       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx, type_tucmd, 0);
>  }
>
>  #define IXGBE_SET_FLAG(_input, _flag, _result) \
> @@ -7843,11 +7856,16 @@ static void ixgbe_tx_olinfo_status(union ixgbe_adv_tx_desc *tx_desc,
>                                         IXGBE_TX_FLAGS_CSUM,
>                                         IXGBE_ADVTXD_POPTS_TXSM);
>
> -       /* enble IPv4 checksum for TSO */
> +       /* enable IPv4 checksum for TSO */
>         olinfo_status |= IXGBE_SET_FLAG(tx_flags,
>                                         IXGBE_TX_FLAGS_IPV4,
>                                         IXGBE_ADVTXD_POPTS_IXSM);
>
> +       /* enable IPsec */
> +       olinfo_status |= IXGBE_SET_FLAG(tx_flags,
> +                                       IXGBE_TX_FLAGS_IPSEC,
> +                                       IXGBE_ADVTXD_POPTS_IPSEC);
> +
>         /*
>          * Check Context must be set if Tx switch is enabled, which it
>          * always is for case where virtual functions are running
> @@ -8306,6 +8324,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
>         u32 tx_flags = 0;
>         unsigned short f;
>         u16 count = TXD_USE_COUNT(skb_headlen(skb));
> +       struct ixgbe_ipsec_tx_data ipsec_tx = { 0 };
>         __be16 protocol = skb->protocol;
>         u8 hdr_len = 0;
>
> @@ -8394,6 +8413,9 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
>                 }
>         }
>
> +       if (skb->sp && ixgbe_ipsec_tx(tx_ring, skb, protocol, &ipsec_tx))
> +               tx_flags |= IXGBE_TX_FLAGS_IPSEC | IXGBE_TX_FLAGS_CC;

You might just want to pull the skb->sp check into ixgbe_ipsec_tx and
could pass tx_flags as a part of the first buffer. It doesn't really
matter anyway as most of this will just be inlined so it will all end
up a part of the same function anyway.

Also I would move this down so that it is handled after the fields in
the first buffer_info structure are set. Then this can ll just fall
inline with the TSO block and get handled there.

> +
>         /* record initial flags and protocol */
>         first->tx_flags = tx_flags;
>         first->protocol = protocol;
> @@ -8410,11 +8432,11 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
>         }
>
>  #endif /* IXGBE_FCOE */

So if you move the function down here it will help to avoid any other
complication. In addition you could follow the same logic that we do
for ixgbe_tso/fso so you could drop the frame instead of transmitting
it if it is requesting a bad offload.

> -       tso = ixgbe_tso(tx_ring, first, &hdr_len);
> +       tso = ixgbe_tso(tx_ring, first, &hdr_len, &ipsec_tx);
>         if (tso < 0)
>                 goto out_drop;
>         else if (!tso)
> -               ixgbe_tx_csum(tx_ring, first);
> +               ixgbe_tx_csum(tx_ring, first, &ipsec_tx);
>
>         /* add the ATR filter if ATR is on */
>         if (test_bit(__IXGBE_TX_FDIR_INIT_DONE, &tx_ring->state))
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
> index 3df0763..0ac725fa 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
> @@ -2856,7 +2856,7 @@ union ixgbe_adv_rx_desc {
>  /* Context descriptors */
>  struct ixgbe_adv_tx_context_desc {
>         __le32 vlan_macip_lens;
> -       __le32 seqnum_seed;
> +       __le32 fceof_saidx;
>         __le32 type_tucmd_mlhl;
>         __le32 mss_l4len_idx;
>  };
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 08/10] ixgbe: process the Tx ipsec offload
@ 2017-12-05 18:13     ` Alexander Duyck
  0 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 18:13 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> If the skb has a security association referenced in the skb, then
> set up the Tx descriptor with the ipsec offload bits.  While we're
> here, we fix an oddly named field in the context descriptor struct.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe.h       | 10 +++-
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 77 ++++++++++++++++++++++++++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c   |  4 +-
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 38 ++++++++++---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_type.h  |  2 +-
>  5 files changed, 118 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index 77f07dc..68097fe 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -171,10 +171,11 @@ enum ixgbe_tx_flags {
>         IXGBE_TX_FLAGS_CC       = 0x08,
>         IXGBE_TX_FLAGS_IPV4     = 0x10,
>         IXGBE_TX_FLAGS_CSUM     = 0x20,
> +       IXGBE_TX_FLAGS_IPSEC    = 0x40,
>
>         /* software defined flags */
> -       IXGBE_TX_FLAGS_SW_VLAN  = 0x40,
> -       IXGBE_TX_FLAGS_FCOE     = 0x80,
> +       IXGBE_TX_FLAGS_SW_VLAN  = 0x80,
> +       IXGBE_TX_FLAGS_FCOE     = 0x100,
>  };
>
>  /* VLAN info */
> @@ -1012,12 +1013,17 @@ void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
>  void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>                     union ixgbe_adv_rx_desc *rx_desc,
>                     struct sk_buff *skb);
> +int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
> +                  __be16 protocol, struct ixgbe_ipsec_tx_data *itd);
>  void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>  #else
>  static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
>  static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>                                   union ixgbe_adv_rx_desc *rx_desc,
>                                   struct sk_buff *skb) { };
> +static inline int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring,
> +                                struct sk_buff *skb, __be16 protocol,
> +                                struct ixgbe_ipsec_tx_data *itd) { return 0; };
>  static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
>  #endif /* CONFIG_XFRM_OFFLOAD */
>  #endif /* _IXGBE_H_ */
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> index fd06d9b..2a0dd7a 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -703,12 +703,89 @@ static void ixgbe_ipsec_del_sa(struct xfrm_state *xs)
>         }
>  }
>
> +/**
> + * ixgbe_ipsec_offload_ok - can this packet use the xfrm hw offload
> + * @skb: current data packet
> + * @xs: pointer to transformer state struct
> + **/
> +static bool ixgbe_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
> +{
> +       if (xs->props.family == AF_INET) {
> +               /* Offload with IPv4 options is not supported yet */
> +               if (ip_hdr(skb)->ihl > 5)

I would make this ihl != 5 instead of "> 5" since smaller values would
be invalid as well.

> +                       return false;
> +       } else {
> +               /* Offload with IPv6 extension headers is not support yet */
> +               if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
> +                       return false;
> +       }
> +
> +       return true;
> +}
> +
>  static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
>         .xdo_dev_state_add = ixgbe_ipsec_add_sa,
>         .xdo_dev_state_delete = ixgbe_ipsec_del_sa,
> +       .xdo_dev_offload_ok = ixgbe_ipsec_offload_ok,
>  };
>
>  /**
> + * ixgbe_ipsec_tx - setup Tx flags for ipsec offload
> + * @tx_ring: outgoing context
> + * @skb: current data packet
> + * @protocol: network protocol
> + * @itd: ipsec Tx data for later use in building context descriptor
> + **/
> +int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
> +                  __be16 protocol, struct ixgbe_ipsec_tx_data *itd)
> +{
> +       struct ixgbe_adapter *adapter = netdev_priv(tx_ring->netdev);
> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
> +       struct xfrm_state *xs;
> +       struct tx_sa *tsa;
> +
> +       if (!skb->sp->len) {
> +               netdev_err(tx_ring->netdev, "%s: no xfrm state len = %d\n",
> +                          __func__, skb->sp->len);
> +               return 0;
> +       }
> +
> +       xs = xfrm_input_state(skb);
> +       if (!xs) {
> +               netdev_err(tx_ring->netdev, "%s: no xfrm_input_state() xs = %p\n",
> +                          __func__, xs);
> +               return 0;
> +       }
> +
> +       itd->sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX;
> +       if (itd->sa_idx > IXGBE_IPSEC_MAX_SA_COUNT) {
> +               netdev_err(tx_ring->netdev, "%s: bad sa_idx=%d handle=%lu\n",
> +                          __func__, itd->sa_idx, xs->xso.offload_handle);
> +               return 0;
> +       }
> +
> +       tsa = &ipsec->tx_tbl[itd->sa_idx];
> +       if (!tsa->used) {
> +               netdev_err(tx_ring->netdev, "%s: unused sa_idx=%d\n",
> +                          __func__, itd->sa_idx);
> +               return 0;
> +       }
> +
> +       itd->flags = 0;
> +       if (xs->id.proto == IPPROTO_ESP) {
> +               itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_TYPE_ESP |
> +                             IXGBE_ADVTXD_TUCMD_L4T_TCP;

Why is the TCP value being set here? This doesn't seem correct either.
This implies TCP a TCP offload. It seems like this should only be
setting ESP.

> +               if (protocol == htons(ETH_P_IP))
> +                       itd->flags |= IXGBE_ADVTXD_TUCMD_IPV4;

Does the IPsec offload need to know if the frame is v4 or v6? I'm just
wondering if it does or not. If not then this probably isn't needed.
One thought on this line is you might look at moving it into
ixgbe_tx_csum. If setting the bit is harmless without setting IXSM we
might look at moving it into the end of ixgbe_tx_csum and just make it
compare against first->protocol there.

> +               itd->trailer_len = xs->props.trailer_len;
> +       }
> +       if (tsa->encrypt)
> +               itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN;
> +
> +       return 1;
> +}
> +
> +/**
>   * ixgbe_ipsec_rx - decode ipsec bits from Rx descriptor
>   * @rx_ring: receiving ring
>   * @rx_desc: receive data descriptor
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> index f1bfae0..d7875b3 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> @@ -1261,7 +1261,7 @@ void ixgbe_clear_interrupt_scheme(struct ixgbe_adapter *adapter)
>  }
>
>  void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring, u32 vlan_macip_lens,
> -                      u32 fcoe_sof_eof, u32 type_tucmd, u32 mss_l4len_idx)
> +                      u32 fceof_saidx, u32 type_tucmd, u32 mss_l4len_idx)
>  {
>         struct ixgbe_adv_tx_context_desc *context_desc;
>         u16 i = tx_ring->next_to_use;
> @@ -1275,7 +1275,7 @@ void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring, u32 vlan_macip_lens,
>         type_tucmd |= IXGBE_TXD_CMD_DEXT | IXGBE_ADVTXD_DTYP_CTXT;
>
>         context_desc->vlan_macip_lens   = cpu_to_le32(vlan_macip_lens);
> -       context_desc->seqnum_seed       = cpu_to_le32(fcoe_sof_eof);
> +       context_desc->fceof_saidx       = cpu_to_le32(fceof_saidx);
>         context_desc->type_tucmd_mlhl   = cpu_to_le32(type_tucmd);
>         context_desc->mss_l4len_idx     = cpu_to_le32(mss_l4len_idx);
>  }
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 60f9f2d..c857594 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -7659,9 +7659,10 @@ static void ixgbe_service_task(struct work_struct *work)
>
>  static int ixgbe_tso(struct ixgbe_ring *tx_ring,
>                      struct ixgbe_tx_buffer *first,
> -                    u8 *hdr_len)
> +                    u8 *hdr_len,
> +                    struct ixgbe_ipsec_tx_data *itd)
>  {
> -       u32 vlan_macip_lens, type_tucmd, mss_l4len_idx;
> +       u32 vlan_macip_lens, type_tucmd, mss_l4len_idx, fceof_saidx = 0;
>         struct sk_buff *skb = first->skb;
>         union {
>                 struct iphdr *v4;
> @@ -7740,7 +7741,12 @@ static int ixgbe_tso(struct ixgbe_ring *tx_ring,
>         vlan_macip_lens |= (ip.hdr - skb->data) << IXGBE_ADVTXD_MACLEN_SHIFT;
>         vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
>
> -       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd,
> +       if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
> +               fceof_saidx |= itd->sa_idx;
> +               type_tucmd |= itd->flags | itd->trailer_len;
> +       }
> +
> +       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx, type_tucmd,
>                           mss_l4len_idx);
>
>         return 1;
> @@ -7756,10 +7762,12 @@ static inline bool ixgbe_ipv6_csum_is_sctp(struct sk_buff *skb)
>  }
>
>  static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
> -                         struct ixgbe_tx_buffer *first)
> +                         struct ixgbe_tx_buffer *first,
> +                         struct ixgbe_ipsec_tx_data *itd)
>  {
>         struct sk_buff *skb = first->skb;
>         u32 vlan_macip_lens = 0;
> +       u32 fceof_saidx = 0;
>         u32 type_tucmd = 0;
>
>         if (skb->ip_summed != CHECKSUM_PARTIAL) {
> @@ -7800,7 +7808,12 @@ static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
>         vlan_macip_lens |= skb_network_offset(skb) << IXGBE_ADVTXD_MACLEN_SHIFT;
>         vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
>
> -       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd, 0);
> +       if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
> +               fceof_saidx |= itd->sa_idx;
> +               type_tucmd |= itd->flags | itd->trailer_len;
> +       }
> +
> +       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx, type_tucmd, 0);
>  }
>
>  #define IXGBE_SET_FLAG(_input, _flag, _result) \
> @@ -7843,11 +7856,16 @@ static void ixgbe_tx_olinfo_status(union ixgbe_adv_tx_desc *tx_desc,
>                                         IXGBE_TX_FLAGS_CSUM,
>                                         IXGBE_ADVTXD_POPTS_TXSM);
>
> -       /* enble IPv4 checksum for TSO */
> +       /* enable IPv4 checksum for TSO */
>         olinfo_status |= IXGBE_SET_FLAG(tx_flags,
>                                         IXGBE_TX_FLAGS_IPV4,
>                                         IXGBE_ADVTXD_POPTS_IXSM);
>
> +       /* enable IPsec */
> +       olinfo_status |= IXGBE_SET_FLAG(tx_flags,
> +                                       IXGBE_TX_FLAGS_IPSEC,
> +                                       IXGBE_ADVTXD_POPTS_IPSEC);
> +
>         /*
>          * Check Context must be set if Tx switch is enabled, which it
>          * always is for case where virtual functions are running
> @@ -8306,6 +8324,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
>         u32 tx_flags = 0;
>         unsigned short f;
>         u16 count = TXD_USE_COUNT(skb_headlen(skb));
> +       struct ixgbe_ipsec_tx_data ipsec_tx = { 0 };
>         __be16 protocol = skb->protocol;
>         u8 hdr_len = 0;
>
> @@ -8394,6 +8413,9 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
>                 }
>         }
>
> +       if (skb->sp && ixgbe_ipsec_tx(tx_ring, skb, protocol, &ipsec_tx))
> +               tx_flags |= IXGBE_TX_FLAGS_IPSEC | IXGBE_TX_FLAGS_CC;

You might just want to pull the skb->sp check into ixgbe_ipsec_tx and
could pass tx_flags as a part of the first buffer. It doesn't really
matter anyway as most of this will just be inlined so it will all end
up a part of the same function anyway.

Also I would move this down so that it is handled after the fields in
the first buffer_info structure are set. Then this can ll just fall
inline with the TSO block and get handled there.

> +
>         /* record initial flags and protocol */
>         first->tx_flags = tx_flags;
>         first->protocol = protocol;
> @@ -8410,11 +8432,11 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
>         }
>
>  #endif /* IXGBE_FCOE */

So if you move the function down here it will help to avoid any other
complication. In addition you could follow the same logic that we do
for ixgbe_tso/fso so you could drop the frame instead of transmitting
it if it is requesting a bad offload.

> -       tso = ixgbe_tso(tx_ring, first, &hdr_len);
> +       tso = ixgbe_tso(tx_ring, first, &hdr_len, &ipsec_tx);
>         if (tso < 0)
>                 goto out_drop;
>         else if (!tso)
> -               ixgbe_tx_csum(tx_ring, first);
> +               ixgbe_tx_csum(tx_ring, first, &ipsec_tx);
>
>         /* add the ATR filter if ATR is on */
>         if (test_bit(__IXGBE_TX_FDIR_INIT_DONE, &tx_ring->state))
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
> index 3df0763..0ac725fa 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
> @@ -2856,7 +2856,7 @@ union ixgbe_adv_rx_desc {
>  /* Context descriptors */
>  struct ixgbe_adv_tx_context_desc {
>         __le32 vlan_macip_lens;
> -       __le32 seqnum_seed;
> +       __le32 fceof_saidx;
>         __le32 type_tucmd_mlhl;
>         __le32 mss_l4len_idx;
>  };
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan at osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 09/10] ixgbe: ipsec offload stats
  2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05 19:53     ` Alexander Duyck
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 19:53 UTC (permalink / raw)
  To: Shannon Nelson
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> Add a simple statistic to count the ipsec offloads.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe.h         |  1 +
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 28 ++++++++++++++----------
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c   |  3 +++
>  3 files changed, 20 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index 68097fe..bb66c85 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -265,6 +265,7 @@ struct ixgbe_rx_buffer {
>  struct ixgbe_queue_stats {
>         u64 packets;
>         u64 bytes;
> +       u64 ipsec_offloads;
>  };
>
>  struct ixgbe_tx_queue_stats {
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> index c3e7a81..dddbc74 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> @@ -1233,34 +1233,34 @@ static void ixgbe_get_ethtool_stats(struct net_device *netdev,
>         for (j = 0; j < netdev->num_tx_queues; j++) {
>                 ring = adapter->tx_ring[j];
>                 if (!ring) {
> -                       data[i] = 0;
> -                       data[i+1] = 0;
> -                       i += 2;
> +                       data[i++] = 0;
> +                       data[i++] = 0;
> +                       data[i++] = 0;
>                         continue;
>                 }
>
>                 do {
>                         start = u64_stats_fetch_begin_irq(&ring->syncp);
> -                       data[i]   = ring->stats.packets;
> -                       data[i+1] = ring->stats.bytes;
> +                       data[i++] = ring->stats.packets;
> +                       data[i++] = ring->stats.bytes;
> +                       data[i++] = ring->stats.ipsec_offloads;
>                 } while (u64_stats_fetch_retry_irq(&ring->syncp, start));
> -               i += 2;
>         }
>         for (j = 0; j < IXGBE_NUM_RX_QUEUES; j++) {
>                 ring = adapter->rx_ring[j];
>                 if (!ring) {
> -                       data[i] = 0;
> -                       data[i+1] = 0;
> -                       i += 2;
> +                       data[i++] = 0;
> +                       data[i++] = 0;
> +                       data[i++] = 0;
>                         continue;
>                 }
>
>                 do {
>                         start = u64_stats_fetch_begin_irq(&ring->syncp);
> -                       data[i]   = ring->stats.packets;
> -                       data[i+1] = ring->stats.bytes;
> +                       data[i++] = ring->stats.packets;
> +                       data[i++] = ring->stats.bytes;
> +                       data[i++] = ring->stats.ipsec_offloads;
>                 } while (u64_stats_fetch_retry_irq(&ring->syncp, start));
> -               i += 2;
>         }
>
>         for (j = 0; j < IXGBE_MAX_PACKET_BUFFERS; j++) {
> @@ -1297,12 +1297,16 @@ static void ixgbe_get_strings(struct net_device *netdev, u32 stringset,
>                         p += ETH_GSTRING_LEN;
>                         sprintf(p, "tx_queue_%u_bytes", i);
>                         p += ETH_GSTRING_LEN;
> +                       sprintf(p, "tx_queue_%u_ipsec_offloads", i);
> +                       p += ETH_GSTRING_LEN;
>                 }
>                 for (i = 0; i < IXGBE_NUM_RX_QUEUES; i++) {
>                         sprintf(p, "rx_queue_%u_packets", i);
>                         p += ETH_GSTRING_LEN;
>                         sprintf(p, "rx_queue_%u_bytes", i);
>                         p += ETH_GSTRING_LEN;
> +                       sprintf(p, "rx_queue_%u_ipsec_offloads", i);
> +                       p += ETH_GSTRING_LEN;
>                 }
>                 for (i = 0; i < IXGBE_MAX_PACKET_BUFFERS; i++) {
>                         sprintf(p, "tx_pb_%u_pxon", i);

I probably wouldn't bother reporting this per ring. It might make more
sense to handle this as an adapter statistic.

> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> index 2a0dd7a..d1220bf 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -782,6 +782,7 @@ int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
>         if (tsa->encrypt)
>                 itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN;
>
> +       tx_ring->stats.ipsec_offloads++;
>         return 1;

Instead of doing this here you may want to make it a part of the Tx
clean-up path. You should still have the flag bit set so you could
test a test for the IPSEC flag bit and if it is set on the tx_buffer
following the transmit you could then increment it there.

>  }
>
> @@ -843,6 +844,8 @@ void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>         xo = xfrm_offload(skb);
>         xo->flags = CRYPTO_DONE;
>         xo->status = CRYPTO_SUCCESS;
> +
> +       rx_ring->stats.ipsec_offloads++;
>  }
>
>  /**
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 09/10] ixgbe: ipsec offload stats
@ 2017-12-05 19:53     ` Alexander Duyck
  0 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 19:53 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> Add a simple statistic to count the ipsec offloads.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe.h         |  1 +
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 28 ++++++++++++++----------
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c   |  3 +++
>  3 files changed, 20 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index 68097fe..bb66c85 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -265,6 +265,7 @@ struct ixgbe_rx_buffer {
>  struct ixgbe_queue_stats {
>         u64 packets;
>         u64 bytes;
> +       u64 ipsec_offloads;
>  };
>
>  struct ixgbe_tx_queue_stats {
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> index c3e7a81..dddbc74 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> @@ -1233,34 +1233,34 @@ static void ixgbe_get_ethtool_stats(struct net_device *netdev,
>         for (j = 0; j < netdev->num_tx_queues; j++) {
>                 ring = adapter->tx_ring[j];
>                 if (!ring) {
> -                       data[i] = 0;
> -                       data[i+1] = 0;
> -                       i += 2;
> +                       data[i++] = 0;
> +                       data[i++] = 0;
> +                       data[i++] = 0;
>                         continue;
>                 }
>
>                 do {
>                         start = u64_stats_fetch_begin_irq(&ring->syncp);
> -                       data[i]   = ring->stats.packets;
> -                       data[i+1] = ring->stats.bytes;
> +                       data[i++] = ring->stats.packets;
> +                       data[i++] = ring->stats.bytes;
> +                       data[i++] = ring->stats.ipsec_offloads;
>                 } while (u64_stats_fetch_retry_irq(&ring->syncp, start));
> -               i += 2;
>         }
>         for (j = 0; j < IXGBE_NUM_RX_QUEUES; j++) {
>                 ring = adapter->rx_ring[j];
>                 if (!ring) {
> -                       data[i] = 0;
> -                       data[i+1] = 0;
> -                       i += 2;
> +                       data[i++] = 0;
> +                       data[i++] = 0;
> +                       data[i++] = 0;
>                         continue;
>                 }
>
>                 do {
>                         start = u64_stats_fetch_begin_irq(&ring->syncp);
> -                       data[i]   = ring->stats.packets;
> -                       data[i+1] = ring->stats.bytes;
> +                       data[i++] = ring->stats.packets;
> +                       data[i++] = ring->stats.bytes;
> +                       data[i++] = ring->stats.ipsec_offloads;
>                 } while (u64_stats_fetch_retry_irq(&ring->syncp, start));
> -               i += 2;
>         }
>
>         for (j = 0; j < IXGBE_MAX_PACKET_BUFFERS; j++) {
> @@ -1297,12 +1297,16 @@ static void ixgbe_get_strings(struct net_device *netdev, u32 stringset,
>                         p += ETH_GSTRING_LEN;
>                         sprintf(p, "tx_queue_%u_bytes", i);
>                         p += ETH_GSTRING_LEN;
> +                       sprintf(p, "tx_queue_%u_ipsec_offloads", i);
> +                       p += ETH_GSTRING_LEN;
>                 }
>                 for (i = 0; i < IXGBE_NUM_RX_QUEUES; i++) {
>                         sprintf(p, "rx_queue_%u_packets", i);
>                         p += ETH_GSTRING_LEN;
>                         sprintf(p, "rx_queue_%u_bytes", i);
>                         p += ETH_GSTRING_LEN;
> +                       sprintf(p, "rx_queue_%u_ipsec_offloads", i);
> +                       p += ETH_GSTRING_LEN;
>                 }
>                 for (i = 0; i < IXGBE_MAX_PACKET_BUFFERS; i++) {
>                         sprintf(p, "tx_pb_%u_pxon", i);

I probably wouldn't bother reporting this per ring. It might make more
sense to handle this as an adapter statistic.

> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> index 2a0dd7a..d1220bf 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -782,6 +782,7 @@ int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
>         if (tsa->encrypt)
>                 itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN;
>
> +       tx_ring->stats.ipsec_offloads++;
>         return 1;

Instead of doing this here you may want to make it a part of the Tx
clean-up path. You should still have the flag bit set so you could
test a test for the IPSEC flag bit and if it is set on the tx_buffer
following the transmit you could then increment it there.

>  }
>
> @@ -843,6 +844,8 @@ void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>         xo = xfrm_offload(skb);
>         xo->flags = CRYPTO_DONE;
>         xo->status = CRYPTO_SUCCESS;
> +
> +       rx_ring->stats.ipsec_offloads++;
>  }
>
>  /**
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan at osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 10/10] ixgbe: register ipsec offload with the xfrm subsystem
  2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
@ 2017-12-05 20:11     ` Alexander Duyck
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 20:11 UTC (permalink / raw)
  To: Shannon Nelson
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> With all the support code in place we can now link in the ipsec
> offload operations and set the ESP feature flag for the XFRM
> subsystem to see.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 4 ++++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 4 ++++
>  2 files changed, 8 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> index d1220bf..0d5497b 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -884,6 +884,10 @@ void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
>         ixgbe_ipsec_clear_hw_tables(adapter);
>         ixgbe_ipsec_stop_engine(adapter);
>
> +       adapter->netdev->xfrmdev_ops = &ixgbe_xfrmdev_ops;
> +       adapter->netdev->features |= NETIF_F_HW_ESP;
> +       adapter->netdev->hw_enc_features |= NETIF_F_HW_ESP;
> +
>         return;
>  err:
>         if (ipsec) {
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index c857594..9231351 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -9799,6 +9799,10 @@ ixgbe_features_check(struct sk_buff *skb, struct net_device *dev,
>         if (skb->encapsulation && !(features & NETIF_F_TSO_MANGLEID))
>                 features &= ~NETIF_F_TSO;
>
> +       /* IPsec offload doesn't get along well with others *yet* */
> +       if (skb->sp)
> +               features &= ~(NETIF_F_TSO | NETIF_F_HW_CSUM_BIT);

I'm pretty sure the feature flag stripping here isn't correct. The
feature bits you want to strip would probably be consistent with the
network_hdr_len check bits included before the MANGLEID check.

We should do some digging into this as it may be a kernel issue. I'm
just wondering if ipsec updates any headers such as the transport
offset or skb checksum start. If either of those are updated that
would explain the issues with getting the offloads to work.

> +
>         return features;
>  }
>
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 10/10] ixgbe: register ipsec offload with the xfrm subsystem
@ 2017-12-05 20:11     ` Alexander Duyck
  0 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-05 20:11 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> With all the support code in place we can now link in the ipsec
> offload operations and set the ESP feature flag for the XFRM
> subsystem to see.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 4 ++++
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 4 ++++
>  2 files changed, 8 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> index d1220bf..0d5497b 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
> @@ -884,6 +884,10 @@ void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
>         ixgbe_ipsec_clear_hw_tables(adapter);
>         ixgbe_ipsec_stop_engine(adapter);
>
> +       adapter->netdev->xfrmdev_ops = &ixgbe_xfrmdev_ops;
> +       adapter->netdev->features |= NETIF_F_HW_ESP;
> +       adapter->netdev->hw_enc_features |= NETIF_F_HW_ESP;
> +
>         return;
>  err:
>         if (ipsec) {
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index c857594..9231351 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -9799,6 +9799,10 @@ ixgbe_features_check(struct sk_buff *skb, struct net_device *dev,
>         if (skb->encapsulation && !(features & NETIF_F_TSO_MANGLEID))
>                 features &= ~NETIF_F_TSO;
>
> +       /* IPsec offload doesn't get along well with others *yet* */
> +       if (skb->sp)
> +               features &= ~(NETIF_F_TSO | NETIF_F_HW_CSUM_BIT);

I'm pretty sure the feature flag stripping here isn't correct. The
feature bits you want to strip would probably be consistent with the
network_hdr_len check bits included before the MANGLEID check.

We should do some digging into this as it may be a kernel issue. I'm
just wondering if ipsec updates any headers such as the transport
offset or skb checksum start. If either of those are updated that
would explain the issues with getting the offloads to work.

> +
>         return features;
>  }
>
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan at osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 02/10] ixgbe: add ipsec register access routines
  2017-12-05 16:56     ` Alexander Duyck
@ 2017-12-07  5:43       ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

Thanks, Alex, for your detailed comments, I do appreciate the time and 
thought you put into them.

Responses below...

sln

On 12/5/2017 8:56 AM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> Add a few routines to make access to the ipsec registers just a little
>> easier, and throw in the beginnings of an initialization.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/Makefile      |   1 +
>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       |   6 +
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 157 +++++++++++++++++++++++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h |  50 ++++++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |   1 +
>>   5 files changed, 215 insertions(+)
>>   create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>   create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/Makefile b/drivers/net/ethernet/intel/ixgbe/Makefile
>> index 35e6fa6..8319465 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/Makefile
>> +++ b/drivers/net/ethernet/intel/ixgbe/Makefile
>> @@ -42,3 +42,4 @@ ixgbe-$(CONFIG_IXGBE_DCB) +=  ixgbe_dcb.o ixgbe_dcb_82598.o \
>>   ixgbe-$(CONFIG_IXGBE_HWMON) += ixgbe_sysfs.o
>>   ixgbe-$(CONFIG_DEBUG_FS) += ixgbe_debugfs.o
>>   ixgbe-$(CONFIG_FCOE:m=y) += ixgbe_fcoe.o
>> +ixgbe-$(CONFIG_XFRM_OFFLOAD) += ixgbe_ipsec.o
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> index dd55787..1e11462 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> @@ -52,6 +52,7 @@
>>   #ifdef CONFIG_IXGBE_DCA
>>   #include <linux/dca.h>
>>   #endif
>> +#include "ixgbe_ipsec.h"
>>
>>   #include <net/busy_poll.h>
>>
>> @@ -1001,4 +1002,9 @@ void ixgbe_store_key(struct ixgbe_adapter *adapter);
>>   void ixgbe_store_reta(struct ixgbe_adapter *adapter);
>>   s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
>>                         u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
>> +#ifdef CONFIG_XFRM_OFFLOAD
>> +void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
>> +#else
>> +static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
>> +#endif /* CONFIG_XFRM_OFFLOAD */
>>   #endif /* _IXGBE_H_ */
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> new file mode 100644
>> index 0000000..14dd011
>> --- /dev/null
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> @@ -0,0 +1,157 @@
>> +/*******************************************************************************
>> + *
>> + * Intel 10 Gigabit PCI Express Linux driver
>> + * Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program.  If not, see <http://www.gnu.org/licenses/>.
>> + *
>> + * The full GNU General Public License is included in this distribution in
>> + * the file called "COPYING".
>> + *
>> + * Contact Information:
>> + * Linux NICS <linux.nics@intel.com>
>> + * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
>> + * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
>> + *
>> + ******************************************************************************/
>> +
>> +#include "ixgbe.h"
>> +
>> +/**
>> + * ixgbe_ipsec_set_tx_sa - set the Tx SA registers
>> + * @hw: hw specific details
>> + * @idx: register index to write
>> + * @key: key byte array
>> + * @salt: salt bytes
>> + **/
>> +static void ixgbe_ipsec_set_tx_sa(struct ixgbe_hw *hw, u16 idx,
>> +                                 u32 key[], u32 salt)
>> +{
>> +       u32 reg;
>> +       int i;
>> +
>> +       for (i = 0; i < 4; i++)
>> +               IXGBE_WRITE_REG(hw, IXGBE_IPSTXKEY(i), cpu_to_be32(key[3-i]));
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXSALT, cpu_to_be32(salt));
>> +       IXGBE_WRITE_FLUSH(hw);
>> +
>> +       reg = IXGBE_READ_REG(hw, IXGBE_IPSTXIDX);
>> +       reg &= IXGBE_RXTXIDX_IPS_EN;
>> +       reg |= idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, reg);
>> +       IXGBE_WRITE_FLUSH(hw);
>> +}
>> +
> 
> So there are a few things here to unpack.
> 
> The first is the carry-forward of the IPS bit. I'm not sure that is
> the best way to go. Do we really expect to be updating SA values if
> IPsec offload is not enabled?

In order to save on energy, we don't enable the engine until we have the 
first SA successfully stored in the tables, so the enable bit will be 
off for that one.

Also, the datasheet specifically says for the Rx table "Software should 
not make changes in the Rx SA tables while changing the IPSEC_EN bit." 
I figured I'd use the same method on both tables for consistency.

> If so we may just want to carry a bit
> flag somewhere in the ixgbe_hw struct indicating if Tx IPsec offload
> is enabled and use that to determine the value for this bit.
> 
> Also we should probably replace "3" with a value indicating that it is
> the SA index shift.

Sure, that would be good.

> 
> Also technically the WRITE_FLUSH isn't needed if you are doing a PCIe
> read anyway to get IPSTXIDX.

That's from having to be very fastidious about these 
reads/writes/flushes before the engine actually worked for me.  I could 
spend time taking them out and testing each change again, but they 
aren't in a fast path, so I'm really not worried about it.

> 
>> +/**
>> + * ixgbe_ipsec_set_rx_item - set an Rx table item
>> + * @hw: hw specific details
>> + * @idx: register index to write
>> + * @tbl: table selector
>> + *
>> + * Trigger the device to store into a particular Rx table the
>> + * data that has already been loaded into the input register
>> + **/
>> +static void ixgbe_ipsec_set_rx_item(struct ixgbe_hw *hw, u16 idx, u32 tbl)
>> +{
>> +       u32 reg;
>> +
>> +       reg = IXGBE_READ_REG(hw, IXGBE_IPSRXIDX);
>> +       reg &= IXGBE_RXTXIDX_IPS_EN;
>> +       reg |= tbl | idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, reg);
>> +       IXGBE_WRITE_FLUSH(hw);
>> +}
>> +
> 
> The Rx version of this gets a bit trickier since the datasheet
> actually indicates there are a few different types of tables that can
> be indexed via this. Also why is the tbl value not being shifted? It
> seems like it should be shifted by 1 to avoid overwriting the IPS_EN
> bit. Really I would like to see the tbl value converted to an enum and
> shifted by 1 in order to generate the table reference.

I would have done this, but we can't use an enum shifted bit because the 
field values are 01, 10, and 11.  I used the direct 2, 4, and 6 values 
rather than shifting by one, but I can reset them and shift by 1.

> 
> Here the "3" is a table index. It might be nice to call that out with
> a name instead of using the magic number.

Yep

> 
>> +/**
>> + * ixgbe_ipsec_set_rx_sa - set up the register bits to save SA info
>> + * @hw: hw specific details
>> + * @idx: register index to write
>> + * @spi: security parameter index
>> + * @key: key byte array
>> + * @salt: salt bytes
>> + * @mode: rx decrypt control bits
>> + * @ip_idx: index into IP table for related IP address
>> + **/
>> +static void ixgbe_ipsec_set_rx_sa(struct ixgbe_hw *hw, u16 idx, __be32 spi,
>> +                                 u32 key[], u32 salt, u32 mode, u32 ip_idx)
>> +{
>> +       int i;
>> +
>> +       /* store the SPI (in bigendian) and IPidx */
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXSPI, spi);
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPIDX, ip_idx);
>> +       IXGBE_WRITE_FLUSH(hw);
>> +
>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_SPI);
>> +
>> +       /* store the key, salt, and mode */
>> +       for (i = 0; i < 4; i++)
>> +               IXGBE_WRITE_REG(hw, IXGBE_IPSRXKEY(i), cpu_to_be32(key[3-i]));
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXSALT, cpu_to_be32(salt));
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXMOD, mode);
>> +       IXGBE_WRITE_FLUSH(hw);
>> +
>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_KEY);
>> +}
> 
> Is there any reason why you could write the SPI, key, salt, and mode,
> then flush, and trigger the writes via the IPSRXIDX? Just wondering
> since it would likely save you a few cycles avoiding PCIe bus stalls.

See note above about religiously flushing everything to make a 
persnickety chip work.
> 
>> +
>> +/**
>> + * ixgbe_ipsec_set_rx_ip - set up the register bits to save SA IP addr info
>> + * @hw: hw specific details
>> + * @idx: register index to write
>> + * @addr: IP address byte array
>> + **/
>> +static void ixgbe_ipsec_set_rx_ip(struct ixgbe_hw *hw, u16 idx, u32 addr[])
>> +{
>> +       int i;
>> +
>> +       /* store the ip address */
>> +       for (i = 0; i < 4; i++)
>> +               IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPADDR(i), addr[i]);
>> +       IXGBE_WRITE_FLUSH(hw);
>> +
>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_IP);
>> +}
>> +
> 
> This piece is kind of confusing. I would suggest storing the address
> as a __be32 pointer instead of a u32 array. That way you start with
> either an IPv6 or an IPv4 address at offset 0 instead of the way the
> hardware is defined which has you writing it at either 0 or 3
> depending on if the address is IPv6 or IPv4.

Using a __be32 rather than u32 is fine here, it doesn't make much 
difference.

If I understand your suggestion correctly, we would also need an 
additional function parameter to tell us if we were pointing to an ipv6 
or ipv4 address.  Since the driver's SW tables are modeling the HW, I 
think it is simpler to leave it in the array.

> 
>> +/**
>> + * ixgbe_ipsec_clear_hw_tables - because some tables don't get cleared on reset
>> + * @adapter: board private structure
>> + **/
>> +void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
>> +{
>> +       struct ixgbe_hw *hw = &adapter->hw;
>> +       u32 buf[4] = {0, 0, 0, 0};
>> +       u16 idx;
>> +
>> +       /* disable Rx and Tx SA lookup */
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, 0);
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, 0);
>> +
>> +       /* scrub the tables */
>> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
>> +               ixgbe_ipsec_set_tx_sa(hw, idx, buf, 0);
>> +
>> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
>> +               ixgbe_ipsec_set_rx_sa(hw, idx, 0, buf, 0, 0, 0);
>> +
>> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_RX_IP_COUNT; idx++)
>> +               ixgbe_ipsec_set_rx_ip(hw, idx, buf);
>> +}
>> +
>> +/**
>> + * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
>> + * @adapter: board private structure
>> + **/
>> +void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
>> +{
>> +       ixgbe_ipsec_clear_hw_tables(adapter);
>> +}
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>> new file mode 100644
>> index 0000000..017b13f
>> --- /dev/null
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>> @@ -0,0 +1,50 @@
>> +/*******************************************************************************
>> +
>> +  Intel 10 Gigabit PCI Express Linux driver
>> +  Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.
>> +
>> +  This program is free software; you can redistribute it and/or modify it
>> +  under the terms and conditions of the GNU General Public License,
>> +  version 2, as published by the Free Software Foundation.
>> +
>> +  This program is distributed in the hope it will be useful, but WITHOUT
>> +  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> +  more details.
>> +
>> +  You should have received a copy of the GNU General Public License along with
>> +  this program.  If not, see <http://www.gnu.org/licenses/>.
>> +
>> +  The full GNU General Public License is included in this distribution in
>> +  the file called "COPYING".
>> +
>> +  Contact Information:
>> +  Linux NICS <linux.nics@intel.com>
>> +  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
>> +  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
>> +
>> +*******************************************************************************/
>> +
>> +#ifndef _IXGBE_IPSEC_H_
>> +#define _IXGBE_IPSEC_H_
>> +
>> +#define IXGBE_IPSEC_MAX_SA_COUNT       1024
>> +#define IXGBE_IPSEC_MAX_RX_IP_COUNT    128
>> +#define IXGBE_IPSEC_BASE_RX_INDEX      IXGBE_IPSEC_MAX_SA_COUNT
>> +#define IXGBE_IPSEC_BASE_TX_INDEX      (IXGBE_IPSEC_MAX_SA_COUNT * 2)
>> +
>> +#define IXGBE_RXTXIDX_IPS_EN           0x00000001
>> +#define IXGBE_RXIDX_TBL_MASK           0x00000006
>> +#define IXGBE_RXIDX_TBL_IP             0x00000002
>> +#define IXGBE_RXIDX_TBL_SPI            0x00000004
>> +#define IXGBE_RXIDX_TBL_KEY            0x00000006
> 
> You might look at converting these table entries into an enum and add
> a shift value. It will make things much easier to read.
> 
>> +#define IXGBE_RXTXIDX_IDX_MASK         0x00001ff8
>> +#define IXGBE_RXTXIDX_IDX_READ         0x40000000
>> +#define IXGBE_RXTXIDX_IDX_WRITE                0x80000000
>> +
>> +#define IXGBE_RXMOD_VALID              0x00000001
>> +#define IXGBE_RXMOD_PROTO_ESP          0x00000004
>> +#define IXGBE_RXMOD_DECRYPT            0x00000008
>> +#define IXGBE_RXMOD_IPV6               0x00000010
>> +
>> +#endif /* _IXGBE_IPSEC_H_ */
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index 6d5f31e..51fb3cf 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -10327,6 +10327,7 @@ static int ixgbe_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>>                                           NETIF_F_FCOE_MTU;
>>          }
>>   #endif /* IXGBE_FCOE */
>> +       ixgbe_init_ipsec_offload(adapter);
>>
>>          if (adapter->flags2 & IXGBE_FLAG2_RSC_CAPABLE)
>>                  netdev->hw_features |= NETIF_F_LRO;
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan@osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 02/10] ixgbe: add ipsec register access routines
@ 2017-12-07  5:43       ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: intel-wired-lan

Thanks, Alex, for your detailed comments, I do appreciate the time and 
thought you put into them.

Responses below...

sln

On 12/5/2017 8:56 AM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> Add a few routines to make access to the ipsec registers just a little
>> easier, and throw in the beginnings of an initialization.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/Makefile      |   1 +
>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       |   6 +
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 157 +++++++++++++++++++++++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h |  50 ++++++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |   1 +
>>   5 files changed, 215 insertions(+)
>>   create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>   create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/Makefile b/drivers/net/ethernet/intel/ixgbe/Makefile
>> index 35e6fa6..8319465 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/Makefile
>> +++ b/drivers/net/ethernet/intel/ixgbe/Makefile
>> @@ -42,3 +42,4 @@ ixgbe-$(CONFIG_IXGBE_DCB) +=  ixgbe_dcb.o ixgbe_dcb_82598.o \
>>   ixgbe-$(CONFIG_IXGBE_HWMON) += ixgbe_sysfs.o
>>   ixgbe-$(CONFIG_DEBUG_FS) += ixgbe_debugfs.o
>>   ixgbe-$(CONFIG_FCOE:m=y) += ixgbe_fcoe.o
>> +ixgbe-$(CONFIG_XFRM_OFFLOAD) += ixgbe_ipsec.o
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> index dd55787..1e11462 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> @@ -52,6 +52,7 @@
>>   #ifdef CONFIG_IXGBE_DCA
>>   #include <linux/dca.h>
>>   #endif
>> +#include "ixgbe_ipsec.h"
>>
>>   #include <net/busy_poll.h>
>>
>> @@ -1001,4 +1002,9 @@ void ixgbe_store_key(struct ixgbe_adapter *adapter);
>>   void ixgbe_store_reta(struct ixgbe_adapter *adapter);
>>   s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
>>                         u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
>> +#ifdef CONFIG_XFRM_OFFLOAD
>> +void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
>> +#else
>> +static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
>> +#endif /* CONFIG_XFRM_OFFLOAD */
>>   #endif /* _IXGBE_H_ */
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> new file mode 100644
>> index 0000000..14dd011
>> --- /dev/null
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> @@ -0,0 +1,157 @@
>> +/*******************************************************************************
>> + *
>> + * Intel 10 Gigabit PCI Express Linux driver
>> + * Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program.  If not, see <http://www.gnu.org/licenses/>.
>> + *
>> + * The full GNU General Public License is included in this distribution in
>> + * the file called "COPYING".
>> + *
>> + * Contact Information:
>> + * Linux NICS <linux.nics@intel.com>
>> + * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
>> + * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
>> + *
>> + ******************************************************************************/
>> +
>> +#include "ixgbe.h"
>> +
>> +/**
>> + * ixgbe_ipsec_set_tx_sa - set the Tx SA registers
>> + * @hw: hw specific details
>> + * @idx: register index to write
>> + * @key: key byte array
>> + * @salt: salt bytes
>> + **/
>> +static void ixgbe_ipsec_set_tx_sa(struct ixgbe_hw *hw, u16 idx,
>> +                                 u32 key[], u32 salt)
>> +{
>> +       u32 reg;
>> +       int i;
>> +
>> +       for (i = 0; i < 4; i++)
>> +               IXGBE_WRITE_REG(hw, IXGBE_IPSTXKEY(i), cpu_to_be32(key[3-i]));
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXSALT, cpu_to_be32(salt));
>> +       IXGBE_WRITE_FLUSH(hw);
>> +
>> +       reg = IXGBE_READ_REG(hw, IXGBE_IPSTXIDX);
>> +       reg &= IXGBE_RXTXIDX_IPS_EN;
>> +       reg |= idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, reg);
>> +       IXGBE_WRITE_FLUSH(hw);
>> +}
>> +
> 
> So there are a few things here to unpack.
> 
> The first is the carry-forward of the IPS bit. I'm not sure that is
> the best way to go. Do we really expect to be updating SA values if
> IPsec offload is not enabled?

In order to save on energy, we don't enable the engine until we have the 
first SA successfully stored in the tables, so the enable bit will be 
off for that one.

Also, the datasheet specifically says for the Rx table "Software should 
not make changes in the Rx SA tables while changing the IPSEC_EN bit." 
I figured I'd use the same method on both tables for consistency.

> If so we may just want to carry a bit
> flag somewhere in the ixgbe_hw struct indicating if Tx IPsec offload
> is enabled and use that to determine the value for this bit.
> 
> Also we should probably replace "3" with a value indicating that it is
> the SA index shift.

Sure, that would be good.

> 
> Also technically the WRITE_FLUSH isn't needed if you are doing a PCIe
> read anyway to get IPSTXIDX.

That's from having to be very fastidious about these 
reads/writes/flushes before the engine actually worked for me.  I could 
spend time taking them out and testing each change again, but they 
aren't in a fast path, so I'm really not worried about it.

> 
>> +/**
>> + * ixgbe_ipsec_set_rx_item - set an Rx table item
>> + * @hw: hw specific details
>> + * @idx: register index to write
>> + * @tbl: table selector
>> + *
>> + * Trigger the device to store into a particular Rx table the
>> + * data that has already been loaded into the input register
>> + **/
>> +static void ixgbe_ipsec_set_rx_item(struct ixgbe_hw *hw, u16 idx, u32 tbl)
>> +{
>> +       u32 reg;
>> +
>> +       reg = IXGBE_READ_REG(hw, IXGBE_IPSRXIDX);
>> +       reg &= IXGBE_RXTXIDX_IPS_EN;
>> +       reg |= tbl | idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, reg);
>> +       IXGBE_WRITE_FLUSH(hw);
>> +}
>> +
> 
> The Rx version of this gets a bit trickier since the datasheet
> actually indicates there are a few different types of tables that can
> be indexed via this. Also why is the tbl value not being shifted? It
> seems like it should be shifted by 1 to avoid overwriting the IPS_EN
> bit. Really I would like to see the tbl value converted to an enum and
> shifted by 1 in order to generate the table reference.

I would have done this, but we can't use an enum shifted bit because the 
field values are 01, 10, and 11.  I used the direct 2, 4, and 6 values 
rather than shifting by one, but I can reset them and shift by 1.

> 
> Here the "3" is a table index. It might be nice to call that out with
> a name instead of using the magic number.

Yep

> 
>> +/**
>> + * ixgbe_ipsec_set_rx_sa - set up the register bits to save SA info
>> + * @hw: hw specific details
>> + * @idx: register index to write
>> + * @spi: security parameter index
>> + * @key: key byte array
>> + * @salt: salt bytes
>> + * @mode: rx decrypt control bits
>> + * @ip_idx: index into IP table for related IP address
>> + **/
>> +static void ixgbe_ipsec_set_rx_sa(struct ixgbe_hw *hw, u16 idx, __be32 spi,
>> +                                 u32 key[], u32 salt, u32 mode, u32 ip_idx)
>> +{
>> +       int i;
>> +
>> +       /* store the SPI (in bigendian) and IPidx */
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXSPI, spi);
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPIDX, ip_idx);
>> +       IXGBE_WRITE_FLUSH(hw);
>> +
>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_SPI);
>> +
>> +       /* store the key, salt, and mode */
>> +       for (i = 0; i < 4; i++)
>> +               IXGBE_WRITE_REG(hw, IXGBE_IPSRXKEY(i), cpu_to_be32(key[3-i]));
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXSALT, cpu_to_be32(salt));
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXMOD, mode);
>> +       IXGBE_WRITE_FLUSH(hw);
>> +
>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_KEY);
>> +}
> 
> Is there any reason why you could write the SPI, key, salt, and mode,
> then flush, and trigger the writes via the IPSRXIDX? Just wondering
> since it would likely save you a few cycles avoiding PCIe bus stalls.

See note above about religiously flushing everything to make a 
persnickety chip work.
> 
>> +
>> +/**
>> + * ixgbe_ipsec_set_rx_ip - set up the register bits to save SA IP addr info
>> + * @hw: hw specific details
>> + * @idx: register index to write
>> + * @addr: IP address byte array
>> + **/
>> +static void ixgbe_ipsec_set_rx_ip(struct ixgbe_hw *hw, u16 idx, u32 addr[])
>> +{
>> +       int i;
>> +
>> +       /* store the ip address */
>> +       for (i = 0; i < 4; i++)
>> +               IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPADDR(i), addr[i]);
>> +       IXGBE_WRITE_FLUSH(hw);
>> +
>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_IP);
>> +}
>> +
> 
> This piece is kind of confusing. I would suggest storing the address
> as a __be32 pointer instead of a u32 array. That way you start with
> either an IPv6 or an IPv4 address at offset 0 instead of the way the
> hardware is defined which has you writing it at either 0 or 3
> depending on if the address is IPv6 or IPv4.

Using a __be32 rather than u32 is fine here, it doesn't make much 
difference.

If I understand your suggestion correctly, we would also need an 
additional function parameter to tell us if we were pointing to an ipv6 
or ipv4 address.  Since the driver's SW tables are modeling the HW, I 
think it is simpler to leave it in the array.

> 
>> +/**
>> + * ixgbe_ipsec_clear_hw_tables - because some tables don't get cleared on reset
>> + * @adapter: board private structure
>> + **/
>> +void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
>> +{
>> +       struct ixgbe_hw *hw = &adapter->hw;
>> +       u32 buf[4] = {0, 0, 0, 0};
>> +       u16 idx;
>> +
>> +       /* disable Rx and Tx SA lookup */
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, 0);
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, 0);
>> +
>> +       /* scrub the tables */
>> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
>> +               ixgbe_ipsec_set_tx_sa(hw, idx, buf, 0);
>> +
>> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
>> +               ixgbe_ipsec_set_rx_sa(hw, idx, 0, buf, 0, 0, 0);
>> +
>> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_RX_IP_COUNT; idx++)
>> +               ixgbe_ipsec_set_rx_ip(hw, idx, buf);
>> +}
>> +
>> +/**
>> + * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
>> + * @adapter: board private structure
>> + **/
>> +void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
>> +{
>> +       ixgbe_ipsec_clear_hw_tables(adapter);
>> +}
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>> new file mode 100644
>> index 0000000..017b13f
>> --- /dev/null
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>> @@ -0,0 +1,50 @@
>> +/*******************************************************************************
>> +
>> +  Intel 10 Gigabit PCI Express Linux driver
>> +  Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.
>> +
>> +  This program is free software; you can redistribute it and/or modify it
>> +  under the terms and conditions of the GNU General Public License,
>> +  version 2, as published by the Free Software Foundation.
>> +
>> +  This program is distributed in the hope it will be useful, but WITHOUT
>> +  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> +  more details.
>> +
>> +  You should have received a copy of the GNU General Public License along with
>> +  this program.  If not, see <http://www.gnu.org/licenses/>.
>> +
>> +  The full GNU General Public License is included in this distribution in
>> +  the file called "COPYING".
>> +
>> +  Contact Information:
>> +  Linux NICS <linux.nics@intel.com>
>> +  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
>> +  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
>> +
>> +*******************************************************************************/
>> +
>> +#ifndef _IXGBE_IPSEC_H_
>> +#define _IXGBE_IPSEC_H_
>> +
>> +#define IXGBE_IPSEC_MAX_SA_COUNT       1024
>> +#define IXGBE_IPSEC_MAX_RX_IP_COUNT    128
>> +#define IXGBE_IPSEC_BASE_RX_INDEX      IXGBE_IPSEC_MAX_SA_COUNT
>> +#define IXGBE_IPSEC_BASE_TX_INDEX      (IXGBE_IPSEC_MAX_SA_COUNT * 2)
>> +
>> +#define IXGBE_RXTXIDX_IPS_EN           0x00000001
>> +#define IXGBE_RXIDX_TBL_MASK           0x00000006
>> +#define IXGBE_RXIDX_TBL_IP             0x00000002
>> +#define IXGBE_RXIDX_TBL_SPI            0x00000004
>> +#define IXGBE_RXIDX_TBL_KEY            0x00000006
> 
> You might look at converting these table entries into an enum and add
> a shift value. It will make things much easier to read.
> 
>> +#define IXGBE_RXTXIDX_IDX_MASK         0x00001ff8
>> +#define IXGBE_RXTXIDX_IDX_READ         0x40000000
>> +#define IXGBE_RXTXIDX_IDX_WRITE                0x80000000
>> +
>> +#define IXGBE_RXMOD_VALID              0x00000001
>> +#define IXGBE_RXMOD_PROTO_ESP          0x00000004
>> +#define IXGBE_RXMOD_DECRYPT            0x00000008
>> +#define IXGBE_RXMOD_IPV6               0x00000010
>> +
>> +#endif /* _IXGBE_IPSEC_H_ */
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index 6d5f31e..51fb3cf 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -10327,6 +10327,7 @@ static int ixgbe_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>>                                           NETIF_F_FCOE_MTU;
>>          }
>>   #endif /* IXGBE_FCOE */
>> +       ixgbe_init_ipsec_offload(adapter);
>>
>>          if (adapter->flags2 & IXGBE_FLAG2_RSC_CAPABLE)
>>                  netdev->hw_features |= NETIF_F_LRO;
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan at osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 03/10] ixgbe: add ipsec engine start and stop routines
  2017-12-05 16:22     ` Alexander Duyck
@ 2017-12-07  5:43       ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On 12/5/2017 8:22 AM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> Add in the code for running and stopping the hardware ipsec
>> encryption/decryption engine.  It is good to keep the engine
>> off when not in use in order to save on the power draw.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 140 +++++++++++++++++++++++++
>>   1 file changed, 140 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> index 14dd011..38a1a16 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> @@ -148,10 +148,150 @@ void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
>>   }
>>
>>   /**
>> + * ixgbe_ipsec_stop_data
>> + * @adapter: board private structure
>> + **/
>> +static void ixgbe_ipsec_stop_data(struct ixgbe_adapter *adapter)
>> +{
>> +       struct ixgbe_hw *hw = &adapter->hw;
>> +       bool link = adapter->link_up;
>> +       u32 t_rdy, r_rdy;
>> +       u32 reg;
>> +
>> +       /* halt data paths */
>> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXCTRL);
>> +       reg |= IXGBE_SECTXCTRL_TX_DIS;
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, reg);
>> +
>> +       reg = IXGBE_READ_REG(hw, IXGBE_SECRXCTRL);
>> +       reg |= IXGBE_SECRXCTRL_RX_DIS;
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, reg);
>> +
>> +       IXGBE_WRITE_FLUSH(hw);
>> +
>> +       /* If the tx fifo doesn't have link, but still has data,
>> +        * we can't clear the tx sec block.  Set the MAC loopback
>> +        * before block clear
>> +        */
>> +       if (!link) {
>> +               reg = IXGBE_READ_REG(hw, IXGBE_MACC);
>> +               reg |= IXGBE_MACC_FLU;
>> +               IXGBE_WRITE_REG(hw, IXGBE_MACC, reg);
>> +
>> +               reg = IXGBE_READ_REG(hw, IXGBE_HLREG0);
>> +               reg |= IXGBE_HLREG0_LPBK;
>> +               IXGBE_WRITE_REG(hw, IXGBE_HLREG0, reg);
>> +
>> +               IXGBE_WRITE_FLUSH(hw);
>> +               mdelay(3);
>> +       }
>> +
>> +       /* wait for the paths to empty */
>> +       do {
>> +               mdelay(10);
>> +               t_rdy = IXGBE_READ_REG(hw, IXGBE_SECTXSTAT) &
>> +                       IXGBE_SECTXSTAT_SECTX_RDY;
>> +               r_rdy = IXGBE_READ_REG(hw, IXGBE_SECRXSTAT) &
>> +                       IXGBE_SECRXSTAT_SECRX_RDY;
>> +       } while (!t_rdy && !r_rdy);
> 
> This piece seems buggy to me. There should be some sort of limit on
> how long you are willing to delay. Otherwise a surprise remove can
> cause this to spin forever when the register reads return all 1's.

Yep - I had meant to limit that.  Thanks.

> 
>> +
>> +       /* undo loopback if we played with it earlier */
>> +       if (!link) {
>> +               reg = IXGBE_READ_REG(hw, IXGBE_MACC);
>> +               reg &= ~IXGBE_MACC_FLU;
>> +               IXGBE_WRITE_REG(hw, IXGBE_MACC, reg);
>> +
>> +               reg = IXGBE_READ_REG(hw, IXGBE_HLREG0);
>> +               reg &= ~IXGBE_HLREG0_LPBK;
>> +               IXGBE_WRITE_REG(hw, IXGBE_HLREG0, reg);
>> +
>> +               IXGBE_WRITE_FLUSH(hw);
>> +       }
>> +}
>> +
>> +/**
>> + * ixgbe_ipsec_stop_engine
>> + * @adapter: board private structure
>> + **/
>> +static void ixgbe_ipsec_stop_engine(struct ixgbe_adapter *adapter)
>> +{
>> +       struct ixgbe_hw *hw = &adapter->hw;
>> +       u32 reg;
>> +
>> +       ixgbe_ipsec_stop_data(adapter);
>> +
>> +       /* disable Rx and Tx SA lookup */
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, 0);
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, 0);
>> +
>> +       /* disable the Rx and Tx engines and full packet store-n-forward */
>> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXCTRL);
>> +       reg |= IXGBE_SECTXCTRL_SECTX_DIS;
>> +       reg &= ~IXGBE_SECTXCTRL_STORE_FORWARD;
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, reg);
>> +
>> +       reg = IXGBE_READ_REG(hw, IXGBE_SECRXCTRL);
>> +       reg |= IXGBE_SECRXCTRL_SECRX_DIS;
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, reg);
>> +
>> +       /* restore the "tx security buffer almost full threshold" to 0x250 */
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXBUFFAF, 0x250);
>> +
>> +       /* Set minimum IFG between packets back to the default 0x1 */
>> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXMINIFG);
>> +       reg = (reg & 0xfffffff0) | 0x1;
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXMINIFG, reg);
>> +
>> +       /* final set for normal (no ipsec offload) processing */
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, IXGBE_SECTXCTRL_SECTX_DIS);
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, IXGBE_SECRXCTRL_SECRX_DIS);
>> +
>> +       IXGBE_WRITE_FLUSH(hw);
>> +}
>> +
>> +/**
>> + * ixgbe_ipsec_start_engine
>> + * @adapter: board private structure
>> + *
>> + * NOTE: this increases power consumption whether being used or not
>> + **/
>> +static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
>> +{
>> +       struct ixgbe_hw *hw = &adapter->hw;
>> +       u32 reg;
>> +
>> +       ixgbe_ipsec_stop_data(adapter);
>> +
>> +       /* Set minimum IFG between packets to 3 */
>> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXMINIFG);
>> +       reg = (reg & 0xfffffff0) | 0x3;
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXMINIFG, reg);
>> +
>> +       /* Set "tx security buffer almost full threshold" to 0x15 so that the
>> +        * almost full indication is generated only after buffer contains at
>> +        * least an entire jumbo packet.
>> +        */
>> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXBUFFAF);
>> +       reg = (reg & 0xfffffc00) | 0x15;
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXBUFFAF, reg);
>> +
>> +       /* restart the data paths by clearing the DISABLE bits */
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, 0);
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, IXGBE_SECTXCTRL_STORE_FORWARD);
>> +
>> +       /* enable Rx and Tx SA lookup */
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, IXGBE_RXTXIDX_IPS_EN);
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, IXGBE_RXTXIDX_IPS_EN);
>> +
>> +       IXGBE_WRITE_FLUSH(hw);
>> +}
>> +
> 
> It would probably make sense to add a data member to the hardware
> structure that tracks if you have IPsec enabled or not. Then you don't
> have to track the IPS_EN bits in patch 2 like you currently are and
> could instead either not do IPsec SA updates if IPsec is not enabled,
> or use the enable value to determine what you write for IPS_EN instead
> of having to read registers.

As I responded earlier, the datasheet says to be sure to not change the 
EN bit while writing the Rx tables.  I'm going to use the same logic on 
both Tx and Rx tables.

sln

> 
>> +/**
>>    * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
>>    * @adapter: board private structure
>>    **/
>>   void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
>>   {
>>          ixgbe_ipsec_clear_hw_tables(adapter);
>> +       ixgbe_ipsec_stop_engine(adapter);
>>   }
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan@osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 03/10] ixgbe: add ipsec engine start and stop routines
@ 2017-12-07  5:43       ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: intel-wired-lan

On 12/5/2017 8:22 AM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> Add in the code for running and stopping the hardware ipsec
>> encryption/decryption engine.  It is good to keep the engine
>> off when not in use in order to save on the power draw.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 140 +++++++++++++++++++++++++
>>   1 file changed, 140 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> index 14dd011..38a1a16 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> @@ -148,10 +148,150 @@ void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
>>   }
>>
>>   /**
>> + * ixgbe_ipsec_stop_data
>> + * @adapter: board private structure
>> + **/
>> +static void ixgbe_ipsec_stop_data(struct ixgbe_adapter *adapter)
>> +{
>> +       struct ixgbe_hw *hw = &adapter->hw;
>> +       bool link = adapter->link_up;
>> +       u32 t_rdy, r_rdy;
>> +       u32 reg;
>> +
>> +       /* halt data paths */
>> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXCTRL);
>> +       reg |= IXGBE_SECTXCTRL_TX_DIS;
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, reg);
>> +
>> +       reg = IXGBE_READ_REG(hw, IXGBE_SECRXCTRL);
>> +       reg |= IXGBE_SECRXCTRL_RX_DIS;
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, reg);
>> +
>> +       IXGBE_WRITE_FLUSH(hw);
>> +
>> +       /* If the tx fifo doesn't have link, but still has data,
>> +        * we can't clear the tx sec block.  Set the MAC loopback
>> +        * before block clear
>> +        */
>> +       if (!link) {
>> +               reg = IXGBE_READ_REG(hw, IXGBE_MACC);
>> +               reg |= IXGBE_MACC_FLU;
>> +               IXGBE_WRITE_REG(hw, IXGBE_MACC, reg);
>> +
>> +               reg = IXGBE_READ_REG(hw, IXGBE_HLREG0);
>> +               reg |= IXGBE_HLREG0_LPBK;
>> +               IXGBE_WRITE_REG(hw, IXGBE_HLREG0, reg);
>> +
>> +               IXGBE_WRITE_FLUSH(hw);
>> +               mdelay(3);
>> +       }
>> +
>> +       /* wait for the paths to empty */
>> +       do {
>> +               mdelay(10);
>> +               t_rdy = IXGBE_READ_REG(hw, IXGBE_SECTXSTAT) &
>> +                       IXGBE_SECTXSTAT_SECTX_RDY;
>> +               r_rdy = IXGBE_READ_REG(hw, IXGBE_SECRXSTAT) &
>> +                       IXGBE_SECRXSTAT_SECRX_RDY;
>> +       } while (!t_rdy && !r_rdy);
> 
> This piece seems buggy to me. There should be some sort of limit on
> how long you are willing to delay. Otherwise a surprise remove can
> cause this to spin forever when the register reads return all 1's.

Yep - I had meant to limit that.  Thanks.

> 
>> +
>> +       /* undo loopback if we played with it earlier */
>> +       if (!link) {
>> +               reg = IXGBE_READ_REG(hw, IXGBE_MACC);
>> +               reg &= ~IXGBE_MACC_FLU;
>> +               IXGBE_WRITE_REG(hw, IXGBE_MACC, reg);
>> +
>> +               reg = IXGBE_READ_REG(hw, IXGBE_HLREG0);
>> +               reg &= ~IXGBE_HLREG0_LPBK;
>> +               IXGBE_WRITE_REG(hw, IXGBE_HLREG0, reg);
>> +
>> +               IXGBE_WRITE_FLUSH(hw);
>> +       }
>> +}
>> +
>> +/**
>> + * ixgbe_ipsec_stop_engine
>> + * @adapter: board private structure
>> + **/
>> +static void ixgbe_ipsec_stop_engine(struct ixgbe_adapter *adapter)
>> +{
>> +       struct ixgbe_hw *hw = &adapter->hw;
>> +       u32 reg;
>> +
>> +       ixgbe_ipsec_stop_data(adapter);
>> +
>> +       /* disable Rx and Tx SA lookup */
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, 0);
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, 0);
>> +
>> +       /* disable the Rx and Tx engines and full packet store-n-forward */
>> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXCTRL);
>> +       reg |= IXGBE_SECTXCTRL_SECTX_DIS;
>> +       reg &= ~IXGBE_SECTXCTRL_STORE_FORWARD;
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, reg);
>> +
>> +       reg = IXGBE_READ_REG(hw, IXGBE_SECRXCTRL);
>> +       reg |= IXGBE_SECRXCTRL_SECRX_DIS;
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, reg);
>> +
>> +       /* restore the "tx security buffer almost full threshold" to 0x250 */
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXBUFFAF, 0x250);
>> +
>> +       /* Set minimum IFG between packets back to the default 0x1 */
>> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXMINIFG);
>> +       reg = (reg & 0xfffffff0) | 0x1;
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXMINIFG, reg);
>> +
>> +       /* final set for normal (no ipsec offload) processing */
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, IXGBE_SECTXCTRL_SECTX_DIS);
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, IXGBE_SECRXCTRL_SECRX_DIS);
>> +
>> +       IXGBE_WRITE_FLUSH(hw);
>> +}
>> +
>> +/**
>> + * ixgbe_ipsec_start_engine
>> + * @adapter: board private structure
>> + *
>> + * NOTE: this increases power consumption whether being used or not
>> + **/
>> +static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
>> +{
>> +       struct ixgbe_hw *hw = &adapter->hw;
>> +       u32 reg;
>> +
>> +       ixgbe_ipsec_stop_data(adapter);
>> +
>> +       /* Set minimum IFG between packets to 3 */
>> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXMINIFG);
>> +       reg = (reg & 0xfffffff0) | 0x3;
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXMINIFG, reg);
>> +
>> +       /* Set "tx security buffer almost full threshold" to 0x15 so that the
>> +        * almost full indication is generated only after buffer contains at
>> +        * least an entire jumbo packet.
>> +        */
>> +       reg = IXGBE_READ_REG(hw, IXGBE_SECTXBUFFAF);
>> +       reg = (reg & 0xfffffc00) | 0x15;
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXBUFFAF, reg);
>> +
>> +       /* restart the data paths by clearing the DISABLE bits */
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, 0);
>> +       IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, IXGBE_SECTXCTRL_STORE_FORWARD);
>> +
>> +       /* enable Rx and Tx SA lookup */
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, IXGBE_RXTXIDX_IPS_EN);
>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, IXGBE_RXTXIDX_IPS_EN);
>> +
>> +       IXGBE_WRITE_FLUSH(hw);
>> +}
>> +
> 
> It would probably make sense to add a data member to the hardware
> structure that tracks if you have IPsec enabled or not. Then you don't
> have to track the IPS_EN bits in patch 2 like you currently are and
> could instead either not do IPsec SA updates if IPsec is not enabled,
> or use the enable value to determine what you write for IPS_EN instead
> of having to read registers.

As I responded earlier, the datasheet says to be sure to not change the 
EN bit while writing the Rx tables.  I'm going to use the same logic on 
both Tx and Rx tables.

sln

> 
>> +/**
>>    * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
>>    * @adapter: board private structure
>>    **/
>>   void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
>>   {
>>          ixgbe_ipsec_clear_hw_tables(adapter);
>> +       ixgbe_ipsec_stop_engine(adapter);
>>   }
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan at osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 04/10] ixgbe: add ipsec data structures
  2017-12-05 17:03     ` Alexander Duyck
@ 2017-12-07  5:43       ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On 12/5/2017 9:03 AM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> Set up the data structures to be used by the ipsec offload.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  5 ++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h | 40 ++++++++++++++++++++++++++
>>   2 files changed, 45 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> index 1e11462..9487750 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> @@ -622,6 +622,7 @@ struct ixgbe_adapter {
>>   #define IXGBE_FLAG2_EEE_CAPABLE                        BIT(14)
>>   #define IXGBE_FLAG2_EEE_ENABLED                        BIT(15)
>>   #define IXGBE_FLAG2_RX_LEGACY                  BIT(16)
>> +#define IXGBE_FLAG2_IPSEC_ENABLED              BIT(17)
>>
>>          /* Tx fast path data */
>>          int num_tx_queues;
>> @@ -772,6 +773,10 @@ struct ixgbe_adapter {
>>
>>   #define IXGBE_RSS_KEY_SIZE     40  /* size of RSS Hash Key in bytes */
>>          u32 *rss_key;
>> +
>> +#ifdef CONFIG_XFRM
>> +       struct ixgbe_ipsec *ipsec;
>> +#endif /* CONFIG_XFRM */
>>   };
>>
>>   static inline u8 ixgbe_max_rss_indices(struct ixgbe_adapter *adapter)
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>> index 017b13f..cb9a4be 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>> @@ -47,4 +47,44 @@
>>   #define IXGBE_RXMOD_DECRYPT            0x00000008
>>   #define IXGBE_RXMOD_IPV6               0x00000010
>>
>> +struct rx_sa {
>> +       struct hlist_node hlist;
>> +       struct xfrm_state *xs;
>> +       u32 ipaddr[4];
> 
> ipaddr should be stored as a __be32, not a u32.

Yep

> 
>> +       u32 key[4];
>> +       u32 salt;
>> +       u32 mode;
>> +       u8  iptbl_ind;
>> +       bool used;
>> +       bool decrypt;
>> +};
>> +
>> +struct rx_ip_sa {
>> +       u32 ipaddr[4];
> 
> Same thing here.

Yep

> 
>> +       u32 ref_cnt;
>> +       bool used;
>> +};
>> +
>> +struct tx_sa {
>> +       struct xfrm_state *xs;
>> +       u32 key[4];
>> +       u32 salt;
>> +       bool encrypt;
>> +       bool used;
>> +};
>> +
>> +struct ixgbe_ipsec_tx_data {
>> +       u32 flags;
>> +       u16 trailer_len;
>> +       u16 sa_idx;
>> +};
>> +
>> +struct ixgbe_ipsec {
>> +       u16 num_rx_sa;
>> +       u16 num_tx_sa;
>> +       struct rx_ip_sa *ip_tbl;
>> +       struct rx_sa *rx_tbl;
>> +       struct tx_sa *tx_tbl;
>> +       DECLARE_HASHTABLE(rx_sa_list, 8);
> 
> The hash table seems a bit on the small side. You might look at
> increasing this to something like 32 in order to try and cut down on
> the load in each bucket since the upper limit is 1K or so isn't it?

I probably can - didn't want to use too much memory real estate if not 
really needed.  I did a little perf timing with a mostly full table and 
didn't see much degradation.  Maybe boost it to 16?

sln

> 
>> +};
>>   #endif /* _IXGBE_IPSEC_H_ */
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan@osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 04/10] ixgbe: add ipsec data structures
@ 2017-12-07  5:43       ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: intel-wired-lan

On 12/5/2017 9:03 AM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> Set up the data structures to be used by the ipsec offload.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  5 ++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h | 40 ++++++++++++++++++++++++++
>>   2 files changed, 45 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> index 1e11462..9487750 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> @@ -622,6 +622,7 @@ struct ixgbe_adapter {
>>   #define IXGBE_FLAG2_EEE_CAPABLE                        BIT(14)
>>   #define IXGBE_FLAG2_EEE_ENABLED                        BIT(15)
>>   #define IXGBE_FLAG2_RX_LEGACY                  BIT(16)
>> +#define IXGBE_FLAG2_IPSEC_ENABLED              BIT(17)
>>
>>          /* Tx fast path data */
>>          int num_tx_queues;
>> @@ -772,6 +773,10 @@ struct ixgbe_adapter {
>>
>>   #define IXGBE_RSS_KEY_SIZE     40  /* size of RSS Hash Key in bytes */
>>          u32 *rss_key;
>> +
>> +#ifdef CONFIG_XFRM
>> +       struct ixgbe_ipsec *ipsec;
>> +#endif /* CONFIG_XFRM */
>>   };
>>
>>   static inline u8 ixgbe_max_rss_indices(struct ixgbe_adapter *adapter)
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>> index 017b13f..cb9a4be 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>> @@ -47,4 +47,44 @@
>>   #define IXGBE_RXMOD_DECRYPT            0x00000008
>>   #define IXGBE_RXMOD_IPV6               0x00000010
>>
>> +struct rx_sa {
>> +       struct hlist_node hlist;
>> +       struct xfrm_state *xs;
>> +       u32 ipaddr[4];
> 
> ipaddr should be stored as a __be32, not a u32.

Yep

> 
>> +       u32 key[4];
>> +       u32 salt;
>> +       u32 mode;
>> +       u8  iptbl_ind;
>> +       bool used;
>> +       bool decrypt;
>> +};
>> +
>> +struct rx_ip_sa {
>> +       u32 ipaddr[4];
> 
> Same thing here.

Yep

> 
>> +       u32 ref_cnt;
>> +       bool used;
>> +};
>> +
>> +struct tx_sa {
>> +       struct xfrm_state *xs;
>> +       u32 key[4];
>> +       u32 salt;
>> +       bool encrypt;
>> +       bool used;
>> +};
>> +
>> +struct ixgbe_ipsec_tx_data {
>> +       u32 flags;
>> +       u16 trailer_len;
>> +       u16 sa_idx;
>> +};
>> +
>> +struct ixgbe_ipsec {
>> +       u16 num_rx_sa;
>> +       u16 num_tx_sa;
>> +       struct rx_ip_sa *ip_tbl;
>> +       struct rx_sa *rx_tbl;
>> +       struct tx_sa *tx_tbl;
>> +       DECLARE_HASHTABLE(rx_sa_list, 8);
> 
> The hash table seems a bit on the small side. You might look at
> increasing this to something like 32 in order to try and cut down on
> the load in each bucket since the upper limit is 1K or so isn't it?

I probably can - didn't want to use too much memory real estate if not 
really needed.  I did a little perf timing with a mostly full table and 
didn't see much degradation.  Maybe boost it to 16?

sln

> 
>> +};
>>   #endif /* _IXGBE_IPSEC_H_ */
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan at osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 05/10] ixgbe: implement ipsec add and remove of offloaded SA
  2017-12-05 17:26     ` Alexander Duyck
@ 2017-12-07  5:43       ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On 12/5/2017 9:26 AM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> Add the functions for setting up and removing offloaded SAs (Security
>> Associations) with the x540 hardware.  We set up the callback structure
>> but we don't yet set the hardware feature bit to be sure the XFRM service
>> won't actually try to use us for an offload yet.
>>
>> The software tables are made up to mimic the hardware tables to make it
>> easier to track what's in the hardware, and the SA table index is used
>> for the XFRM offload handle.  However, there is a hashing field in the
>> Rx SA tracking that will be used to facilitate faster table searches in
>> the Rx fast path.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 377 +++++++++++++++++++++++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |   6 +
>>   2 files changed, 383 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> index 38a1a16..7b01d92 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> @@ -26,6 +26,8 @@
>>    ******************************************************************************/
>>
>>   #include "ixgbe.h"
>> +#include <net/xfrm.h>
>> +#include <crypto/aead.h>
>>
>>   /**
>>    * ixgbe_ipsec_set_tx_sa - set the Tx SA registers
>> @@ -128,6 +130,7 @@ static void ixgbe_ipsec_set_rx_ip(struct ixgbe_hw *hw, u16 idx, u32 addr[])
>>    **/
>>   void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
>>   {
>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>>          struct ixgbe_hw *hw = &adapter->hw;
>>          u32 buf[4] = {0, 0, 0, 0};
>>          u16 idx;
>> @@ -139,9 +142,11 @@ void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
>>          /* scrub the tables */
>>          for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
>>                  ixgbe_ipsec_set_tx_sa(hw, idx, buf, 0);
>> +       ipsec->num_tx_sa = 0;
>>
>>          for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
>>                  ixgbe_ipsec_set_rx_sa(hw, idx, 0, buf, 0, 0, 0);
>> +       ipsec->num_rx_sa = 0;
>>
>>          for (idx = 0; idx < IXGBE_IPSEC_MAX_RX_IP_COUNT; idx++)
>>                  ixgbe_ipsec_set_rx_ip(hw, idx, buf);
>> @@ -287,11 +292,383 @@ static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
>>   }
>>
>>   /**
>> + * ixgbe_ipsec_find_empty_idx - find the first unused security parameter index
>> + * @ipsec: pointer to ipsec struct
>> + * @rxtable: true if we need to look in the Rx table
>> + *
>> + * Returns the first unused index in either the Rx or Tx SA table
>> + **/
>> +static int ixgbe_ipsec_find_empty_idx(struct ixgbe_ipsec *ipsec, bool rxtable)
>> +{
>> +       u32 i;
>> +
>> +       if (rxtable) {
>> +               if (ipsec->num_rx_sa == IXGBE_IPSEC_MAX_SA_COUNT)
>> +                       return -ENOSPC;
>> +
>> +               /* search rx sa table */
>> +               for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
>> +                       if (!ipsec->rx_tbl[i].used)
>> +                               return i;
>> +               }
>> +       } else {
>> +               if (ipsec->num_rx_sa == IXGBE_IPSEC_MAX_SA_COUNT)
>> +                       return -ENOSPC;
> 
> Should this bi num_tx_sa?

Hmm - can you say cut-and-paste?  Will fix.

> 
>> +
>> +               /* search tx sa table */
>> +               for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
>> +                       if (!ipsec->tx_tbl[i].used)
>> +                               return i;
>> +               }
>> +       }
>> +
>> +       return -ENOSPC;
>> +}
>> +
>> +/**
>> + * ixgbe_ipsec_parse_proto_keys - find the key and salt based on the protocol
>> + * @xs: pointer to xfrm_state struct
>> + * @mykey: pointer to key array to populate
>> + * @mysalt: pointer to salt value to populate
>> + *
>> + * This copies the protocol keys and salt to our own data tables.  The
>> + * 82599 family only supports the one algorithm.
>> + **/
>> +static int ixgbe_ipsec_parse_proto_keys(struct xfrm_state *xs,
>> +                                       u32 *mykey, u32 *mysalt)
>> +{
>> +       struct net_device *dev = xs->xso.dev;
>> +       unsigned char *key_data;
>> +       char *alg_name = NULL;
>> +       char *aes_gcm_name = "rfc4106(gcm(aes))";
> 
> aes_gcm_name should probably be a static const char array instead of a pointer.

Sure.

> 
>> +       int key_len;
>> +
>> +       if (xs->aead) {
>> +               key_data = &xs->aead->alg_key[0];
>> +               key_len = xs->aead->alg_key_len;
>> +               alg_name = xs->aead->alg_name;
>> +       } else {
>> +               netdev_err(dev, "Unsupported IPsec algorithm\n");
>> +               return -EINVAL;
>> +       }
>> +
>> +       if (strcmp(alg_name, aes_gcm_name)) {
>> +               netdev_err(dev, "Unsupported IPsec algorithm - please use %s\n",
>> +                          aes_gcm_name);
>> +               return -EINVAL;
>> +       }
>> +
>> +       /* 160 accounts for 16 byte key and 4 byte salt */
>> +       if (key_len == 128) {
>> +               netdev_info(dev, "IPsec hw offload parameters missing 32 bit salt value\n");
>> +       } else if (key_len != 160) {
>> +               netdev_err(dev, "IPsec hw offload only supports keys up to 128 bits with a 32 bit salt\n");
>> +               return -EINVAL;
>> +       }
>> +
>> +       /* The key bytes come down in a bigendian array of bytes, and
>> +        * salt is always the last 4 bytes of the key array.
>> +        * We don't need to do any byteswapping.
>> +        */
>> +       memcpy(mykey, key_data, 16);
>> +       if (key_len == 160)
>> +               *mysalt = ((u32 *)key_data)[4];
>> +       else
>> +               *mysalt = 0;
> 
> You could combine these key_len checks into a single if/else set.
> Basically just do something like the following:

Alex, ever the reductionist :-)
Yep, makes sense.

> 
> /* 160 accounts for 16 byte key and 4 byte salt */
> if (key_len == 160) {
>           *mysalt = ((u32 *)key_data)[4];
> } else if (key_len != 128) {
>          netdev_err(dev, "IPsec hw offload only supports keys up to 128
> bits with a 32 bit salt\n");
>          return -EINVAL;
> } else {
>          netdev_info(dev, "IPsec hw offload parameters missing 32 bit
> salt value\n");
>          *mysalt = 0;
> }
> 
>   /* The key bytes come down in a bigendian array of bytes, and
>    * salt is always the last 4 bytes of the key array.
>    * We don't need to do any byteswapping.
>    */
> memcpy(mykey, key_data, 16);
> 
>> +
>> +       return 0;
>> +}
>> +
>> +/**
>> + * ixgbe_ipsec_add_sa - program device with a security association
>> + * @xs: pointer to transformer state struct
>> + **/
>> +static int ixgbe_ipsec_add_sa(struct xfrm_state *xs)
>> +{
>> +       struct net_device *dev = xs->xso.dev;
>> +       struct ixgbe_adapter *adapter = netdev_priv(dev);
>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>> +       struct ixgbe_hw *hw = &adapter->hw;
>> +       int checked, match, first;
>> +       u16 sa_idx;
>> +       int ret;
>> +       int i;
>> +
>> +       if (xs->id.proto != IPPROTO_ESP && xs->id.proto != IPPROTO_AH) {
>> +               netdev_err(dev, "Unsupported protocol 0x%04x for ipsec offload\n",
>> +                          xs->id.proto);
>> +               return -EINVAL;
>> +       }
>> +
>> +       if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
>> +               struct rx_sa rsa;
>> +
>> +               if (xs->calg) {
>> +                       netdev_err(dev, "Compression offload not supported\n");
>> +                       return -EINVAL;
>> +               }
>> +
>> +               /* find the first unused index */
>> +               ret = ixgbe_ipsec_find_empty_idx(ipsec, true);
>> +               if (ret < 0) {
>> +                       netdev_err(dev, "No space for SA in Rx table!\n");
>> +                       return ret;
>> +               }
>> +               sa_idx = (u16)ret;
>> +
>> +               memset(&rsa, 0, sizeof(rsa));
>> +               rsa.used = true;
>> +               rsa.xs = xs;
>> +
>> +               if (rsa.xs->id.proto & IPPROTO_ESP)
>> +                       rsa.decrypt = xs->ealg || xs->aead;
>> +
>> +               /* get the key and salt */
>> +               ret = ixgbe_ipsec_parse_proto_keys(xs, rsa.key, &rsa.salt);
>> +               if (ret) {
>> +                       netdev_err(dev, "Failed to get key data for Rx SA table\n");
>> +                       return ret;
>> +               }
>> +
>> +               /* get ip for rx sa table */
>> +               if (xs->xso.flags & XFRM_OFFLOAD_IPV6)
>> +                       memcpy(rsa.ipaddr, &xs->id.daddr.a6, 16);
>> +               else
>> +                       memcpy(&rsa.ipaddr[3], &xs->id.daddr.a4, 4);
>> +
>> +               /* The HW does not have a 1:1 mapping from keys to IP addrs, so
>> +                * check for a matching IP addr entry in the table.  If the addr
>> +                * already exists, use it; else find an unused slot and add the
>> +                * addr.  If one does not exist and there are no unused table
>> +                * entries, fail the request.
>> +                */
>> +
>> +               /* Find an existing match or first not used, and stop looking
>> +                * after we've checked all we know we have.
>> +                */
>> +               checked = 0;
>> +               match = -1;
>> +               first = -1;
>> +               for (i = 0;
>> +                    i < IXGBE_IPSEC_MAX_RX_IP_COUNT &&
>> +                    (checked < ipsec->num_rx_sa || first < 0);
>> +                    i++) {
>> +                       if (ipsec->ip_tbl[i].used) {
>> +                               if (!memcmp(ipsec->ip_tbl[i].ipaddr,
>> +                                           rsa.ipaddr, sizeof(rsa.ipaddr))) {
>> +                                       match = i;
>> +                                       break;
>> +                               }
>> +                               checked++;
>> +                       } else if (first < 0) {
>> +                               first = i;  /* track the first empty seen */
>> +                       }
>> +               }
>> +
>> +               if (ipsec->num_rx_sa == 0)
>> +                       first = 0;
>> +
>> +               if (match >= 0) {
>> +                       /* addrs are the same, we should use this one */
>> +                       rsa.iptbl_ind = match;
>> +                       ipsec->ip_tbl[match].ref_cnt++;
>> +
>> +               } else if (first >= 0) {
>> +                       /* no matches, but here's an empty slot */
>> +                       rsa.iptbl_ind = first;
>> +
>> +                       memcpy(ipsec->ip_tbl[first].ipaddr,
>> +                              rsa.ipaddr, sizeof(rsa.ipaddr));
>> +                       ipsec->ip_tbl[first].ref_cnt = 1;
>> +                       ipsec->ip_tbl[first].used = true;
>> +
>> +                       ixgbe_ipsec_set_rx_ip(hw, rsa.iptbl_ind, rsa.ipaddr);
>> +
>> +               } else {
>> +                       /* no match and no empty slot */
>> +                       netdev_err(dev, "No space for SA in Rx IP SA table\n");
>> +                       memset(&rsa, 0, sizeof(rsa));
>> +                       return -ENOSPC;
>> +               }
>> +
>> +               rsa.mode = IXGBE_RXMOD_VALID;
>> +               if (rsa.xs->id.proto & IPPROTO_ESP)
>> +                       rsa.mode |= IXGBE_RXMOD_PROTO_ESP;
>> +               if (rsa.decrypt)
>> +                       rsa.mode |= IXGBE_RXMOD_DECRYPT;
>> +               if (rsa.xs->xso.flags & XFRM_OFFLOAD_IPV6)
>> +                       rsa.mode |= IXGBE_RXMOD_IPV6;
>> +
>> +               /* the preparations worked, so save the info */
>> +               memcpy(&ipsec->rx_tbl[sa_idx], &rsa, sizeof(rsa));
>> +
>> +               ixgbe_ipsec_set_rx_sa(hw, sa_idx, rsa.xs->id.spi, rsa.key,
>> +                                     rsa.salt, rsa.mode, rsa.iptbl_ind);
>> +               xs->xso.offload_handle = sa_idx + IXGBE_IPSEC_BASE_RX_INDEX;
>> +
>> +               ipsec->num_rx_sa++;
>> +
>> +               /* hash the new entry for faster search in Rx path */
>> +               hash_add_rcu(ipsec->rx_sa_list, &ipsec->rx_tbl[sa_idx].hlist,
>> +                            rsa.xs->id.spi);
>> +       } else {
>> +               struct tx_sa tsa;
>> +
>> +               /* find the first unused index */
>> +               ret = ixgbe_ipsec_find_empty_idx(ipsec, false);
>> +               if (ret < 0) {
>> +                       netdev_err(dev, "No space for SA in Tx table\n");
>> +                       return ret;
>> +               }
>> +               sa_idx = (u16)ret;
>> +
>> +               memset(&tsa, 0, sizeof(tsa));
>> +               tsa.used = true;
>> +               tsa.xs = xs;
>> +
>> +               if (xs->id.proto & IPPROTO_ESP)
>> +                       tsa.encrypt = xs->ealg || xs->aead;
>> +
>> +               ret = ixgbe_ipsec_parse_proto_keys(xs, tsa.key, &tsa.salt);
>> +               if (ret) {
>> +                       netdev_err(dev, "Failed to get key data for Tx SA table\n");
>> +                       memset(&tsa, 0, sizeof(tsa));
>> +                       return ret;
>> +               }
>> +
>> +               /* the preparations worked, so save the info */
>> +               memcpy(&ipsec->tx_tbl[sa_idx], &tsa, sizeof(tsa));
>> +
>> +               ixgbe_ipsec_set_tx_sa(hw, sa_idx, tsa.key, tsa.salt);
>> +
>> +               xs->xso.offload_handle = sa_idx + IXGBE_IPSEC_BASE_TX_INDEX;
>> +
>> +               ipsec->num_tx_sa++;
>> +       }
>> +
>> +       /* enable the engine if not already warmed up */
>> +       if (!(adapter->flags2 & IXGBE_FLAG2_IPSEC_ENABLED)) {
>> +               ixgbe_ipsec_start_engine(adapter);
>> +               adapter->flags2 |= IXGBE_FLAG2_IPSEC_ENABLED;
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>> +/**
>> + * ixgbe_ipsec_del_sa - clear out this specific SA
>> + * @xs: pointer to transformer state struct
>> + **/
>> +static void ixgbe_ipsec_del_sa(struct xfrm_state *xs)
>> +{
>> +       struct net_device *dev = xs->xso.dev;
>> +       struct ixgbe_adapter *adapter = netdev_priv(dev);
>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>> +       struct ixgbe_hw *hw = &adapter->hw;
>> +       u32 zerobuf[4] = {0, 0, 0, 0};
>> +       u16 sa_idx;
>> +
>> +       if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
>> +               struct rx_sa *rsa;
>> +               u8 ipi;
>> +
>> +               sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_RX_INDEX;
>> +               rsa = &ipsec->rx_tbl[sa_idx];
>> +
>> +               if (!rsa->used) {
>> +                       netdev_err(dev, "Invalid Rx SA selected sa_idx=%d offload_handle=%lu\n",
>> +                                  sa_idx, xs->xso.offload_handle);
>> +                       return;
>> +               }
>> +
>> +               ixgbe_ipsec_set_rx_sa(hw, sa_idx, 0, zerobuf, 0, 0, 0);
>> +               hash_del_rcu(&rsa->hlist);
>> +
>> +               /* if the IP table entry is referenced by only this SA,
>> +                * i.e. ref_cnt is only 1, clear the IP table entry as well
>> +                */
>> +               ipi = rsa->iptbl_ind;
>> +               if (ipsec->ip_tbl[ipi].ref_cnt > 0) {
>> +                       ipsec->ip_tbl[ipi].ref_cnt--;
>> +
>> +                       if (!ipsec->ip_tbl[ipi].ref_cnt) {
>> +                               memset(&ipsec->ip_tbl[ipi], 0,
>> +                                      sizeof(struct rx_ip_sa));
>> +                               ixgbe_ipsec_set_rx_ip(hw, ipi, zerobuf);
>> +                       }
>> +               }
>> +
>> +               memset(rsa, 0, sizeof(struct rx_sa));
>> +               ipsec->num_rx_sa--;
>> +       } else {
>> +               sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX;
>> +
>> +               if (!ipsec->tx_tbl[sa_idx].used) {
>> +                       netdev_err(dev, "Invalid Tx SA selected sa_idx=%d offload_handle=%lu\n",
>> +                                  sa_idx, xs->xso.offload_handle);
>> +                       return;
>> +               }
>> +
>> +               ixgbe_ipsec_set_tx_sa(hw, sa_idx, zerobuf, 0);
>> +               memset(&ipsec->tx_tbl[sa_idx], 0, sizeof(struct tx_sa));
>> +               ipsec->num_tx_sa--;
>> +       }
>> +
>> +       /* if there are no SAs left, stop the engine to save energy */
>> +       if (ipsec->num_rx_sa == 0 && ipsec->num_tx_sa == 0) {
>> +               adapter->flags2 &= ~IXGBE_FLAG2_IPSEC_ENABLED;
>> +               ixgbe_ipsec_stop_engine(adapter);
>> +       }
>> +}
>> +
>> +static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
>> +       .xdo_dev_state_add = ixgbe_ipsec_add_sa,
>> +       .xdo_dev_state_delete = ixgbe_ipsec_del_sa,
>> +};
>> +
>> +/**
>>    * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
>>    * @adapter: board private structure
>>    **/
>>   void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
>>   {
>> +       struct ixgbe_ipsec *ipsec;
>> +       size_t size;
>> +
>> +       ipsec = kzalloc(sizeof(*ipsec), GFP_KERNEL);
>> +       if (!ipsec)
>> +               goto err;
> 
> I would say just add another label to skip over the if statement you
> added below.

Yep.

> 
>> +       hash_init(ipsec->rx_sa_list);
>> +
>> +       size = sizeof(struct rx_sa) * IXGBE_IPSEC_MAX_SA_COUNT;
>> +       ipsec->rx_tbl = kzalloc(size, GFP_KERNEL);
>> +       if (!ipsec->rx_tbl)
>> +               goto err;
>> +
>> +       size = sizeof(struct tx_sa) * IXGBE_IPSEC_MAX_SA_COUNT;
>> +       ipsec->tx_tbl = kzalloc(size, GFP_KERNEL);
>> +       if (!ipsec->tx_tbl)
>> +               goto err;
>> +
>> +       size = sizeof(struct rx_ip_sa) * IXGBE_IPSEC_MAX_RX_IP_COUNT;
>> +       ipsec->ip_tbl = kzalloc(size, GFP_KERNEL);
>> +       if (!ipsec->ip_tbl)
>> +               goto err;
> 
> Do all these tables need to be allocated separately? I'm just
> wondering if we can get away with doing something like what we did
> with the ixgbe_q_vector structure where you just allocate this as one
> physical block of memory and just split it up into multiple chunks
> with a separate pointer to each chunk. Doing that would cut down on
> the exception handling needed since it would be a single allocation
> failure you would have to deal with.

This may really just come down to style, and my thoughts around this are 
relatively trivial:
  - Is it nicer to the memory system to do one large alloc or a couple 
of smaller ones?
  - If any bounds-checking scans are done, this method would allow for 
checking, while I think the single large alloc wouldn't be as good for 
bounds checking between these tables.

> 
>> +       ipsec->num_rx_sa = 0;
>> +       ipsec->num_tx_sa = 0;
>> +
>> +       adapter->ipsec = ipsec;
>>          ixgbe_ipsec_clear_hw_tables(adapter);
>>          ixgbe_ipsec_stop_engine(adapter);

By the way, I hink I need to turn these two around and make sure the 
engine is stopped first.  It just seems right.

>> +
>> +       return;
>> +err:
>> +       if (ipsec) {
>> +               kfree(ipsec->ip_tbl);
>> +               kfree(ipsec->rx_tbl);
>> +               kfree(ipsec->tx_tbl);
>> +               kfree(adapter->ipsec);
>> +       }
>> +       netdev_err(adapter->netdev, "Unable to allocate memory for SA tables");
>>   }
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index 51fb3cf..01fd89b 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -10542,6 +10542,12 @@ static void ixgbe_remove(struct pci_dev *pdev)
>>          set_bit(__IXGBE_REMOVING, &adapter->state);
>>          cancel_work_sync(&adapter->service_task);
>>
>> +#ifdef CONFIG_XFRM
>> +       kfree(adapter->ipsec->ip_tbl);
>> +       kfree(adapter->ipsec->rx_tbl);
>> +       kfree(adapter->ipsec->tx_tbl);
>> +       kfree(adapter->ipsec);
>> +#endif /* CONFIG_XFRM */
> 
> It might be useful if you were to move this into a function of its
> own. Also you should probably check for adapter->ipsec first,
> otherwise you are going to cause NULL pointer dereference any time
> adapter->ipsec isn't defined. because you are dereferencing it when
> you go to free each of those tables.

Yep

Thanks,
sln

> 
>>
>>   #ifdef CONFIG_IXGBE_DCA
>>          if (adapter->flags & IXGBE_FLAG_DCA_ENABLED) {
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan@osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 05/10] ixgbe: implement ipsec add and remove of offloaded SA
@ 2017-12-07  5:43       ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: intel-wired-lan

On 12/5/2017 9:26 AM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> Add the functions for setting up and removing offloaded SAs (Security
>> Associations) with the x540 hardware.  We set up the callback structure
>> but we don't yet set the hardware feature bit to be sure the XFRM service
>> won't actually try to use us for an offload yet.
>>
>> The software tables are made up to mimic the hardware tables to make it
>> easier to track what's in the hardware, and the SA table index is used
>> for the XFRM offload handle.  However, there is a hashing field in the
>> Rx SA tracking that will be used to facilitate faster table searches in
>> the Rx fast path.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 377 +++++++++++++++++++++++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |   6 +
>>   2 files changed, 383 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> index 38a1a16..7b01d92 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> @@ -26,6 +26,8 @@
>>    ******************************************************************************/
>>
>>   #include "ixgbe.h"
>> +#include <net/xfrm.h>
>> +#include <crypto/aead.h>
>>
>>   /**
>>    * ixgbe_ipsec_set_tx_sa - set the Tx SA registers
>> @@ -128,6 +130,7 @@ static void ixgbe_ipsec_set_rx_ip(struct ixgbe_hw *hw, u16 idx, u32 addr[])
>>    **/
>>   void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
>>   {
>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>>          struct ixgbe_hw *hw = &adapter->hw;
>>          u32 buf[4] = {0, 0, 0, 0};
>>          u16 idx;
>> @@ -139,9 +142,11 @@ void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
>>          /* scrub the tables */
>>          for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
>>                  ixgbe_ipsec_set_tx_sa(hw, idx, buf, 0);
>> +       ipsec->num_tx_sa = 0;
>>
>>          for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
>>                  ixgbe_ipsec_set_rx_sa(hw, idx, 0, buf, 0, 0, 0);
>> +       ipsec->num_rx_sa = 0;
>>
>>          for (idx = 0; idx < IXGBE_IPSEC_MAX_RX_IP_COUNT; idx++)
>>                  ixgbe_ipsec_set_rx_ip(hw, idx, buf);
>> @@ -287,11 +292,383 @@ static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
>>   }
>>
>>   /**
>> + * ixgbe_ipsec_find_empty_idx - find the first unused security parameter index
>> + * @ipsec: pointer to ipsec struct
>> + * @rxtable: true if we need to look in the Rx table
>> + *
>> + * Returns the first unused index in either the Rx or Tx SA table
>> + **/
>> +static int ixgbe_ipsec_find_empty_idx(struct ixgbe_ipsec *ipsec, bool rxtable)
>> +{
>> +       u32 i;
>> +
>> +       if (rxtable) {
>> +               if (ipsec->num_rx_sa == IXGBE_IPSEC_MAX_SA_COUNT)
>> +                       return -ENOSPC;
>> +
>> +               /* search rx sa table */
>> +               for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
>> +                       if (!ipsec->rx_tbl[i].used)
>> +                               return i;
>> +               }
>> +       } else {
>> +               if (ipsec->num_rx_sa == IXGBE_IPSEC_MAX_SA_COUNT)
>> +                       return -ENOSPC;
> 
> Should this bi num_tx_sa?

Hmm - can you say cut-and-paste?  Will fix.

> 
>> +
>> +               /* search tx sa table */
>> +               for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
>> +                       if (!ipsec->tx_tbl[i].used)
>> +                               return i;
>> +               }
>> +       }
>> +
>> +       return -ENOSPC;
>> +}
>> +
>> +/**
>> + * ixgbe_ipsec_parse_proto_keys - find the key and salt based on the protocol
>> + * @xs: pointer to xfrm_state struct
>> + * @mykey: pointer to key array to populate
>> + * @mysalt: pointer to salt value to populate
>> + *
>> + * This copies the protocol keys and salt to our own data tables.  The
>> + * 82599 family only supports the one algorithm.
>> + **/
>> +static int ixgbe_ipsec_parse_proto_keys(struct xfrm_state *xs,
>> +                                       u32 *mykey, u32 *mysalt)
>> +{
>> +       struct net_device *dev = xs->xso.dev;
>> +       unsigned char *key_data;
>> +       char *alg_name = NULL;
>> +       char *aes_gcm_name = "rfc4106(gcm(aes))";
> 
> aes_gcm_name should probably be a static const char array instead of a pointer.

Sure.

> 
>> +       int key_len;
>> +
>> +       if (xs->aead) {
>> +               key_data = &xs->aead->alg_key[0];
>> +               key_len = xs->aead->alg_key_len;
>> +               alg_name = xs->aead->alg_name;
>> +       } else {
>> +               netdev_err(dev, "Unsupported IPsec algorithm\n");
>> +               return -EINVAL;
>> +       }
>> +
>> +       if (strcmp(alg_name, aes_gcm_name)) {
>> +               netdev_err(dev, "Unsupported IPsec algorithm - please use %s\n",
>> +                          aes_gcm_name);
>> +               return -EINVAL;
>> +       }
>> +
>> +       /* 160 accounts for 16 byte key and 4 byte salt */
>> +       if (key_len == 128) {
>> +               netdev_info(dev, "IPsec hw offload parameters missing 32 bit salt value\n");
>> +       } else if (key_len != 160) {
>> +               netdev_err(dev, "IPsec hw offload only supports keys up to 128 bits with a 32 bit salt\n");
>> +               return -EINVAL;
>> +       }
>> +
>> +       /* The key bytes come down in a bigendian array of bytes, and
>> +        * salt is always the last 4 bytes of the key array.
>> +        * We don't need to do any byteswapping.
>> +        */
>> +       memcpy(mykey, key_data, 16);
>> +       if (key_len == 160)
>> +               *mysalt = ((u32 *)key_data)[4];
>> +       else
>> +               *mysalt = 0;
> 
> You could combine these key_len checks into a single if/else set.
> Basically just do something like the following:

Alex, ever the reductionist :-)
Yep, makes sense.

> 
> /* 160 accounts for 16 byte key and 4 byte salt */
> if (key_len == 160) {
>           *mysalt = ((u32 *)key_data)[4];
> } else if (key_len != 128) {
>          netdev_err(dev, "IPsec hw offload only supports keys up to 128
> bits with a 32 bit salt\n");
>          return -EINVAL;
> } else {
>          netdev_info(dev, "IPsec hw offload parameters missing 32 bit
> salt value\n");
>          *mysalt = 0;
> }
> 
>   /* The key bytes come down in a bigendian array of bytes, and
>    * salt is always the last 4 bytes of the key array.
>    * We don't need to do any byteswapping.
>    */
> memcpy(mykey, key_data, 16);
> 
>> +
>> +       return 0;
>> +}
>> +
>> +/**
>> + * ixgbe_ipsec_add_sa - program device with a security association
>> + * @xs: pointer to transformer state struct
>> + **/
>> +static int ixgbe_ipsec_add_sa(struct xfrm_state *xs)
>> +{
>> +       struct net_device *dev = xs->xso.dev;
>> +       struct ixgbe_adapter *adapter = netdev_priv(dev);
>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>> +       struct ixgbe_hw *hw = &adapter->hw;
>> +       int checked, match, first;
>> +       u16 sa_idx;
>> +       int ret;
>> +       int i;
>> +
>> +       if (xs->id.proto != IPPROTO_ESP && xs->id.proto != IPPROTO_AH) {
>> +               netdev_err(dev, "Unsupported protocol 0x%04x for ipsec offload\n",
>> +                          xs->id.proto);
>> +               return -EINVAL;
>> +       }
>> +
>> +       if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
>> +               struct rx_sa rsa;
>> +
>> +               if (xs->calg) {
>> +                       netdev_err(dev, "Compression offload not supported\n");
>> +                       return -EINVAL;
>> +               }
>> +
>> +               /* find the first unused index */
>> +               ret = ixgbe_ipsec_find_empty_idx(ipsec, true);
>> +               if (ret < 0) {
>> +                       netdev_err(dev, "No space for SA in Rx table!\n");
>> +                       return ret;
>> +               }
>> +               sa_idx = (u16)ret;
>> +
>> +               memset(&rsa, 0, sizeof(rsa));
>> +               rsa.used = true;
>> +               rsa.xs = xs;
>> +
>> +               if (rsa.xs->id.proto & IPPROTO_ESP)
>> +                       rsa.decrypt = xs->ealg || xs->aead;
>> +
>> +               /* get the key and salt */
>> +               ret = ixgbe_ipsec_parse_proto_keys(xs, rsa.key, &rsa.salt);
>> +               if (ret) {
>> +                       netdev_err(dev, "Failed to get key data for Rx SA table\n");
>> +                       return ret;
>> +               }
>> +
>> +               /* get ip for rx sa table */
>> +               if (xs->xso.flags & XFRM_OFFLOAD_IPV6)
>> +                       memcpy(rsa.ipaddr, &xs->id.daddr.a6, 16);
>> +               else
>> +                       memcpy(&rsa.ipaddr[3], &xs->id.daddr.a4, 4);
>> +
>> +               /* The HW does not have a 1:1 mapping from keys to IP addrs, so
>> +                * check for a matching IP addr entry in the table.  If the addr
>> +                * already exists, use it; else find an unused slot and add the
>> +                * addr.  If one does not exist and there are no unused table
>> +                * entries, fail the request.
>> +                */
>> +
>> +               /* Find an existing match or first not used, and stop looking
>> +                * after we've checked all we know we have.
>> +                */
>> +               checked = 0;
>> +               match = -1;
>> +               first = -1;
>> +               for (i = 0;
>> +                    i < IXGBE_IPSEC_MAX_RX_IP_COUNT &&
>> +                    (checked < ipsec->num_rx_sa || first < 0);
>> +                    i++) {
>> +                       if (ipsec->ip_tbl[i].used) {
>> +                               if (!memcmp(ipsec->ip_tbl[i].ipaddr,
>> +                                           rsa.ipaddr, sizeof(rsa.ipaddr))) {
>> +                                       match = i;
>> +                                       break;
>> +                               }
>> +                               checked++;
>> +                       } else if (first < 0) {
>> +                               first = i;  /* track the first empty seen */
>> +                       }
>> +               }
>> +
>> +               if (ipsec->num_rx_sa == 0)
>> +                       first = 0;
>> +
>> +               if (match >= 0) {
>> +                       /* addrs are the same, we should use this one */
>> +                       rsa.iptbl_ind = match;
>> +                       ipsec->ip_tbl[match].ref_cnt++;
>> +
>> +               } else if (first >= 0) {
>> +                       /* no matches, but here's an empty slot */
>> +                       rsa.iptbl_ind = first;
>> +
>> +                       memcpy(ipsec->ip_tbl[first].ipaddr,
>> +                              rsa.ipaddr, sizeof(rsa.ipaddr));
>> +                       ipsec->ip_tbl[first].ref_cnt = 1;
>> +                       ipsec->ip_tbl[first].used = true;
>> +
>> +                       ixgbe_ipsec_set_rx_ip(hw, rsa.iptbl_ind, rsa.ipaddr);
>> +
>> +               } else {
>> +                       /* no match and no empty slot */
>> +                       netdev_err(dev, "No space for SA in Rx IP SA table\n");
>> +                       memset(&rsa, 0, sizeof(rsa));
>> +                       return -ENOSPC;
>> +               }
>> +
>> +               rsa.mode = IXGBE_RXMOD_VALID;
>> +               if (rsa.xs->id.proto & IPPROTO_ESP)
>> +                       rsa.mode |= IXGBE_RXMOD_PROTO_ESP;
>> +               if (rsa.decrypt)
>> +                       rsa.mode |= IXGBE_RXMOD_DECRYPT;
>> +               if (rsa.xs->xso.flags & XFRM_OFFLOAD_IPV6)
>> +                       rsa.mode |= IXGBE_RXMOD_IPV6;
>> +
>> +               /* the preparations worked, so save the info */
>> +               memcpy(&ipsec->rx_tbl[sa_idx], &rsa, sizeof(rsa));
>> +
>> +               ixgbe_ipsec_set_rx_sa(hw, sa_idx, rsa.xs->id.spi, rsa.key,
>> +                                     rsa.salt, rsa.mode, rsa.iptbl_ind);
>> +               xs->xso.offload_handle = sa_idx + IXGBE_IPSEC_BASE_RX_INDEX;
>> +
>> +               ipsec->num_rx_sa++;
>> +
>> +               /* hash the new entry for faster search in Rx path */
>> +               hash_add_rcu(ipsec->rx_sa_list, &ipsec->rx_tbl[sa_idx].hlist,
>> +                            rsa.xs->id.spi);
>> +       } else {
>> +               struct tx_sa tsa;
>> +
>> +               /* find the first unused index */
>> +               ret = ixgbe_ipsec_find_empty_idx(ipsec, false);
>> +               if (ret < 0) {
>> +                       netdev_err(dev, "No space for SA in Tx table\n");
>> +                       return ret;
>> +               }
>> +               sa_idx = (u16)ret;
>> +
>> +               memset(&tsa, 0, sizeof(tsa));
>> +               tsa.used = true;
>> +               tsa.xs = xs;
>> +
>> +               if (xs->id.proto & IPPROTO_ESP)
>> +                       tsa.encrypt = xs->ealg || xs->aead;
>> +
>> +               ret = ixgbe_ipsec_parse_proto_keys(xs, tsa.key, &tsa.salt);
>> +               if (ret) {
>> +                       netdev_err(dev, "Failed to get key data for Tx SA table\n");
>> +                       memset(&tsa, 0, sizeof(tsa));
>> +                       return ret;
>> +               }
>> +
>> +               /* the preparations worked, so save the info */
>> +               memcpy(&ipsec->tx_tbl[sa_idx], &tsa, sizeof(tsa));
>> +
>> +               ixgbe_ipsec_set_tx_sa(hw, sa_idx, tsa.key, tsa.salt);
>> +
>> +               xs->xso.offload_handle = sa_idx + IXGBE_IPSEC_BASE_TX_INDEX;
>> +
>> +               ipsec->num_tx_sa++;
>> +       }
>> +
>> +       /* enable the engine if not already warmed up */
>> +       if (!(adapter->flags2 & IXGBE_FLAG2_IPSEC_ENABLED)) {
>> +               ixgbe_ipsec_start_engine(adapter);
>> +               adapter->flags2 |= IXGBE_FLAG2_IPSEC_ENABLED;
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>> +/**
>> + * ixgbe_ipsec_del_sa - clear out this specific SA
>> + * @xs: pointer to transformer state struct
>> + **/
>> +static void ixgbe_ipsec_del_sa(struct xfrm_state *xs)
>> +{
>> +       struct net_device *dev = xs->xso.dev;
>> +       struct ixgbe_adapter *adapter = netdev_priv(dev);
>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>> +       struct ixgbe_hw *hw = &adapter->hw;
>> +       u32 zerobuf[4] = {0, 0, 0, 0};
>> +       u16 sa_idx;
>> +
>> +       if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
>> +               struct rx_sa *rsa;
>> +               u8 ipi;
>> +
>> +               sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_RX_INDEX;
>> +               rsa = &ipsec->rx_tbl[sa_idx];
>> +
>> +               if (!rsa->used) {
>> +                       netdev_err(dev, "Invalid Rx SA selected sa_idx=%d offload_handle=%lu\n",
>> +                                  sa_idx, xs->xso.offload_handle);
>> +                       return;
>> +               }
>> +
>> +               ixgbe_ipsec_set_rx_sa(hw, sa_idx, 0, zerobuf, 0, 0, 0);
>> +               hash_del_rcu(&rsa->hlist);
>> +
>> +               /* if the IP table entry is referenced by only this SA,
>> +                * i.e. ref_cnt is only 1, clear the IP table entry as well
>> +                */
>> +               ipi = rsa->iptbl_ind;
>> +               if (ipsec->ip_tbl[ipi].ref_cnt > 0) {
>> +                       ipsec->ip_tbl[ipi].ref_cnt--;
>> +
>> +                       if (!ipsec->ip_tbl[ipi].ref_cnt) {
>> +                               memset(&ipsec->ip_tbl[ipi], 0,
>> +                                      sizeof(struct rx_ip_sa));
>> +                               ixgbe_ipsec_set_rx_ip(hw, ipi, zerobuf);
>> +                       }
>> +               }
>> +
>> +               memset(rsa, 0, sizeof(struct rx_sa));
>> +               ipsec->num_rx_sa--;
>> +       } else {
>> +               sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX;
>> +
>> +               if (!ipsec->tx_tbl[sa_idx].used) {
>> +                       netdev_err(dev, "Invalid Tx SA selected sa_idx=%d offload_handle=%lu\n",
>> +                                  sa_idx, xs->xso.offload_handle);
>> +                       return;
>> +               }
>> +
>> +               ixgbe_ipsec_set_tx_sa(hw, sa_idx, zerobuf, 0);
>> +               memset(&ipsec->tx_tbl[sa_idx], 0, sizeof(struct tx_sa));
>> +               ipsec->num_tx_sa--;
>> +       }
>> +
>> +       /* if there are no SAs left, stop the engine to save energy */
>> +       if (ipsec->num_rx_sa == 0 && ipsec->num_tx_sa == 0) {
>> +               adapter->flags2 &= ~IXGBE_FLAG2_IPSEC_ENABLED;
>> +               ixgbe_ipsec_stop_engine(adapter);
>> +       }
>> +}
>> +
>> +static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
>> +       .xdo_dev_state_add = ixgbe_ipsec_add_sa,
>> +       .xdo_dev_state_delete = ixgbe_ipsec_del_sa,
>> +};
>> +
>> +/**
>>    * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
>>    * @adapter: board private structure
>>    **/
>>   void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
>>   {
>> +       struct ixgbe_ipsec *ipsec;
>> +       size_t size;
>> +
>> +       ipsec = kzalloc(sizeof(*ipsec), GFP_KERNEL);
>> +       if (!ipsec)
>> +               goto err;
> 
> I would say just add another label to skip over the if statement you
> added below.

Yep.

> 
>> +       hash_init(ipsec->rx_sa_list);
>> +
>> +       size = sizeof(struct rx_sa) * IXGBE_IPSEC_MAX_SA_COUNT;
>> +       ipsec->rx_tbl = kzalloc(size, GFP_KERNEL);
>> +       if (!ipsec->rx_tbl)
>> +               goto err;
>> +
>> +       size = sizeof(struct tx_sa) * IXGBE_IPSEC_MAX_SA_COUNT;
>> +       ipsec->tx_tbl = kzalloc(size, GFP_KERNEL);
>> +       if (!ipsec->tx_tbl)
>> +               goto err;
>> +
>> +       size = sizeof(struct rx_ip_sa) * IXGBE_IPSEC_MAX_RX_IP_COUNT;
>> +       ipsec->ip_tbl = kzalloc(size, GFP_KERNEL);
>> +       if (!ipsec->ip_tbl)
>> +               goto err;
> 
> Do all these tables need to be allocated separately? I'm just
> wondering if we can get away with doing something like what we did
> with the ixgbe_q_vector structure where you just allocate this as one
> physical block of memory and just split it up into multiple chunks
> with a separate pointer to each chunk. Doing that would cut down on
> the exception handling needed since it would be a single allocation
> failure you would have to deal with.

This may really just come down to style, and my thoughts around this are 
relatively trivial:
  - Is it nicer to the memory system to do one large alloc or a couple 
of smaller ones?
  - If any bounds-checking scans are done, this method would allow for 
checking, while I think the single large alloc wouldn't be as good for 
bounds checking between these tables.

> 
>> +       ipsec->num_rx_sa = 0;
>> +       ipsec->num_tx_sa = 0;
>> +
>> +       adapter->ipsec = ipsec;
>>          ixgbe_ipsec_clear_hw_tables(adapter);
>>          ixgbe_ipsec_stop_engine(adapter);

By the way, I hink I need to turn these two around and make sure the 
engine is stopped first.  It just seems right.

>> +
>> +       return;
>> +err:
>> +       if (ipsec) {
>> +               kfree(ipsec->ip_tbl);
>> +               kfree(ipsec->rx_tbl);
>> +               kfree(ipsec->tx_tbl);
>> +               kfree(adapter->ipsec);
>> +       }
>> +       netdev_err(adapter->netdev, "Unable to allocate memory for SA tables");
>>   }
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index 51fb3cf..01fd89b 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -10542,6 +10542,12 @@ static void ixgbe_remove(struct pci_dev *pdev)
>>          set_bit(__IXGBE_REMOVING, &adapter->state);
>>          cancel_work_sync(&adapter->service_task);
>>
>> +#ifdef CONFIG_XFRM
>> +       kfree(adapter->ipsec->ip_tbl);
>> +       kfree(adapter->ipsec->rx_tbl);
>> +       kfree(adapter->ipsec->tx_tbl);
>> +       kfree(adapter->ipsec);
>> +#endif /* CONFIG_XFRM */
> 
> It might be useful if you were to move this into a function of its
> own. Also you should probably check for adapter->ipsec first,
> otherwise you are going to cause NULL pointer dereference any time
> adapter->ipsec isn't defined. because you are dereferencing it when
> you go to free each of those tables.

Yep

Thanks,
sln

> 
>>
>>   #ifdef CONFIG_IXGBE_DCA
>>          if (adapter->flags & IXGBE_FLAG_DCA_ENABLED) {
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan at osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 06/10] ixgbe: restore offloaded SAs after a reset
  2017-12-05 17:30     ` Alexander Duyck
@ 2017-12-07  5:43       ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On 12/5/2017 9:30 AM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> On a chip reset most of the table contents are lost, so must be
>> restored.  This scans the driver's ipsec tables and restores both
>> the filled and empty table slots to their pre-reset values.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  2 +
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 53 ++++++++++++++++++++++++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  1 +
>>   3 files changed, 56 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> index 9487750..7e8bca7 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> @@ -1009,7 +1009,9 @@ s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
>>                         u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
>>   #ifdef CONFIG_XFRM_OFFLOAD
>>   void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
>> +void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>>   #else
>>   static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
>> +static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
>>   #endif /* CONFIG_XFRM_OFFLOAD */
>>   #endif /* _IXGBE_H_ */
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> index 7b01d92..b93ee7f 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> @@ -292,6 +292,59 @@ static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
>>   }
>>
>>   /**
>> + * ixgbe_ipsec_restore - restore the ipsec HW settings after a reset
>> + * @adapter: board private structure
>> + **/
>> +void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter)
>> +{
>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>> +       struct ixgbe_hw *hw = &adapter->hw;
>> +       u32 zbuf[4] = {0, 0, 0, 0};
> 
> zbuf should be a static const.

Yep

> 
>> +       int i;
>> +
>> +       if (!(adapter->flags2 & IXGBE_FLAG2_IPSEC_ENABLED))
>> +               return;
>> +
>> +       /* clean up the engine settings */
>> +       ixgbe_ipsec_stop_engine(adapter);
>> +
>> +       /* start the engine */
>> +       ixgbe_ipsec_start_engine(adapter);
>> +
>> +       /* reload the IP addrs */
>> +       for (i = 0; i < IXGBE_IPSEC_MAX_RX_IP_COUNT; i++) {
>> +               struct rx_ip_sa *ipsa = &ipsec->ip_tbl[i];
>> +
>> +               if (ipsa->used)
>> +                       ixgbe_ipsec_set_rx_ip(hw, i, ipsa->ipaddr);
>> +               else
>> +                       ixgbe_ipsec_set_rx_ip(hw, i, zbuf);
> 
> If we are doing a restore do we actually need to write the zero
> values? If we did a reset I thought you had a function that was going
> through and zeroing everything out. If so this now becomes redundant.

Currently ixgbe_ipsec_clear_hw_tables() only gets run at at probe.  It 
should probably get run at remove as well.  Doing this is a bit of 
safety paranoia, and making sure the CAM memory structures that don't 
get cleared on reset have exactly what I expect in them.

> 
>> +       }
>> +
>> +       /* reload the Rx keys */
>> +       for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
>> +               struct rx_sa *rsa = &ipsec->rx_tbl[i];
>> +
>> +               if (rsa->used)
>> +                       ixgbe_ipsec_set_rx_sa(hw, i, rsa->xs->id.spi,
>> +                                             rsa->key, rsa->salt,
>> +                                             rsa->mode, rsa->iptbl_ind);
>> +               else
>> +                       ixgbe_ipsec_set_rx_sa(hw, i, 0, zbuf, 0, 0, 0);
> 
> same here
> 
>> +       }
>> +
>> +       /* reload the Tx keys */
>> +       for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
>> +               struct tx_sa *tsa = &ipsec->tx_tbl[i];
>> +
>> +               if (tsa->used)
>> +                       ixgbe_ipsec_set_tx_sa(hw, i, tsa->key, tsa->salt);
>> +               else
>> +                       ixgbe_ipsec_set_tx_sa(hw, i, zbuf, 0);
> 
> and here
> 
>> +       }
>> +}
>> +
>> +/**
>>    * ixgbe_ipsec_find_empty_idx - find the first unused security parameter index
>>    * @ipsec: pointer to ipsec struct
>>    * @rxtable: true if we need to look in the Rx table
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index 01fd89b..6eabf92 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -5347,6 +5347,7 @@ static void ixgbe_configure(struct ixgbe_adapter *adapter)
>>
>>          ixgbe_set_rx_mode(adapter->netdev);
>>          ixgbe_restore_vlan(adapter);
>> +       ixgbe_ipsec_restore(adapter);
>>
>>          switch (hw->mac.type) {
>>          case ixgbe_mac_82599EB:
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan@osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 06/10] ixgbe: restore offloaded SAs after a reset
@ 2017-12-07  5:43       ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: intel-wired-lan

On 12/5/2017 9:30 AM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> On a chip reset most of the table contents are lost, so must be
>> restored.  This scans the driver's ipsec tables and restores both
>> the filled and empty table slots to their pre-reset values.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  2 +
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 53 ++++++++++++++++++++++++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  1 +
>>   3 files changed, 56 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> index 9487750..7e8bca7 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> @@ -1009,7 +1009,9 @@ s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
>>                         u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
>>   #ifdef CONFIG_XFRM_OFFLOAD
>>   void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
>> +void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>>   #else
>>   static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
>> +static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
>>   #endif /* CONFIG_XFRM_OFFLOAD */
>>   #endif /* _IXGBE_H_ */
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> index 7b01d92..b93ee7f 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> @@ -292,6 +292,59 @@ static void ixgbe_ipsec_start_engine(struct ixgbe_adapter *adapter)
>>   }
>>
>>   /**
>> + * ixgbe_ipsec_restore - restore the ipsec HW settings after a reset
>> + * @adapter: board private structure
>> + **/
>> +void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter)
>> +{
>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>> +       struct ixgbe_hw *hw = &adapter->hw;
>> +       u32 zbuf[4] = {0, 0, 0, 0};
> 
> zbuf should be a static const.

Yep

> 
>> +       int i;
>> +
>> +       if (!(adapter->flags2 & IXGBE_FLAG2_IPSEC_ENABLED))
>> +               return;
>> +
>> +       /* clean up the engine settings */
>> +       ixgbe_ipsec_stop_engine(adapter);
>> +
>> +       /* start the engine */
>> +       ixgbe_ipsec_start_engine(adapter);
>> +
>> +       /* reload the IP addrs */
>> +       for (i = 0; i < IXGBE_IPSEC_MAX_RX_IP_COUNT; i++) {
>> +               struct rx_ip_sa *ipsa = &ipsec->ip_tbl[i];
>> +
>> +               if (ipsa->used)
>> +                       ixgbe_ipsec_set_rx_ip(hw, i, ipsa->ipaddr);
>> +               else
>> +                       ixgbe_ipsec_set_rx_ip(hw, i, zbuf);
> 
> If we are doing a restore do we actually need to write the zero
> values? If we did a reset I thought you had a function that was going
> through and zeroing everything out. If so this now becomes redundant.

Currently ixgbe_ipsec_clear_hw_tables() only gets run at at probe.  It 
should probably get run at remove as well.  Doing this is a bit of 
safety paranoia, and making sure the CAM memory structures that don't 
get cleared on reset have exactly what I expect in them.

> 
>> +       }
>> +
>> +       /* reload the Rx keys */
>> +       for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
>> +               struct rx_sa *rsa = &ipsec->rx_tbl[i];
>> +
>> +               if (rsa->used)
>> +                       ixgbe_ipsec_set_rx_sa(hw, i, rsa->xs->id.spi,
>> +                                             rsa->key, rsa->salt,
>> +                                             rsa->mode, rsa->iptbl_ind);
>> +               else
>> +                       ixgbe_ipsec_set_rx_sa(hw, i, 0, zbuf, 0, 0, 0);
> 
> same here
> 
>> +       }
>> +
>> +       /* reload the Tx keys */
>> +       for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
>> +               struct tx_sa *tsa = &ipsec->tx_tbl[i];
>> +
>> +               if (tsa->used)
>> +                       ixgbe_ipsec_set_tx_sa(hw, i, tsa->key, tsa->salt);
>> +               else
>> +                       ixgbe_ipsec_set_tx_sa(hw, i, zbuf, 0);
> 
> and here
> 
>> +       }
>> +}
>> +
>> +/**
>>    * ixgbe_ipsec_find_empty_idx - find the first unused security parameter index
>>    * @ipsec: pointer to ipsec struct
>>    * @rxtable: true if we need to look in the Rx table
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index 01fd89b..6eabf92 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -5347,6 +5347,7 @@ static void ixgbe_configure(struct ixgbe_adapter *adapter)
>>
>>          ixgbe_set_rx_mode(adapter->netdev);
>>          ixgbe_restore_vlan(adapter);
>> +       ixgbe_ipsec_restore(adapter);
>>
>>          switch (hw->mac.type) {
>>          case ixgbe_mac_82599EB:
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan at osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 07/10] ixgbe: process the Rx ipsec offload
  2017-12-05 17:40     ` Alexander Duyck
@ 2017-12-07  5:43       ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On 12/5/2017 9:40 AM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> If the chip sees and decrypts an ipsec offload, set up the skb
>> sp pointer with the ralated SA info.  Since the chip is rude
>> enough to keep to itself the table index it used for the
>> decryption, we have to do our own table lookup, using the
>> hash for speed.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  6 ++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 89 ++++++++++++++++++++++++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  3 +
>>   3 files changed, 98 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> index 7e8bca7..77f07dc 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> @@ -1009,9 +1009,15 @@ s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
>>                         u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
>>   #ifdef CONFIG_XFRM_OFFLOAD
>>   void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
>> +void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>> +                   union ixgbe_adv_rx_desc *rx_desc,
>> +                   struct sk_buff *skb);
>>   void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>>   #else
>>   static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
>> +static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>> +                                 union ixgbe_adv_rx_desc *rx_desc,
>> +                                 struct sk_buff *skb) { };
>>   static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
>>   #endif /* CONFIG_XFRM_OFFLOAD */
>>   #endif /* _IXGBE_H_ */
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> index b93ee7f..fd06d9b 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> @@ -379,6 +379,35 @@ static int ixgbe_ipsec_find_empty_idx(struct ixgbe_ipsec *ipsec, bool rxtable)
>>   }
>>
>>   /**
>> + * ixgbe_ipsec_find_rx_state - find the state that matches
>> + * @ipsec: pointer to ipsec struct
>> + * @daddr: inbound address to match
>> + * @proto: protocol to match
>> + * @spi: SPI to match
>> + *
>> + * Returns a pointer to the matching SA state information
>> + **/
>> +static struct xfrm_state *ixgbe_ipsec_find_rx_state(struct ixgbe_ipsec *ipsec,
>> +                                                   __be32 daddr, u8 proto,
>> +                                                   __be32 spi)
>> +{
>> +       struct rx_sa *rsa;
>> +       struct xfrm_state *ret = NULL;
>> +
>> +       rcu_read_lock();
>> +       hash_for_each_possible_rcu(ipsec->rx_sa_list, rsa, hlist, spi)
>> +               if (spi == rsa->xs->id.spi &&
>> +                   daddr == rsa->xs->id.daddr.a4 &&
>> +                   proto == rsa->xs->id.proto) {
>> +                       ret = rsa->xs;
>> +                       xfrm_state_hold(ret);
>> +                       break;
>> +               }
>> +       rcu_read_unlock();
>> +       return ret;
>> +}
>> +
> 
> You need to choose a bucket, not just walk through all buckets.

I may be wrong, but I believe that is what is happening here, where the 
spi is the hash key.  As the function description says "iterate over all 
possible objects hashing to the same bucket".  Besides, I basically 
cribbed this directly from our Mellanox friends (thanks!).

> Otherwise you might as well have just used a linked list. You might
> look at using something like jhash_3words to generate a hash which you
> then use to choose the bucket.
> 
>> +/**
>>    * ixgbe_ipsec_parse_proto_keys - find the key and salt based on the protocol
>>    * @xs: pointer to xfrm_state struct
>>    * @mykey: pointer to key array to populate
>> @@ -680,6 +709,66 @@ static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
>>   };
>>
>>   /**
>> + * ixgbe_ipsec_rx - decode ipsec bits from Rx descriptor
>> + * @rx_ring: receiving ring
>> + * @rx_desc: receive data descriptor
>> + * @skb: current data packet
>> + *
>> + * Determine if there was an ipsec encapsulation noticed, and if so set up
>> + * the resulting status for later in the receive stack.
>> + **/
>> +void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>> +                   union ixgbe_adv_rx_desc *rx_desc,
>> +                   struct sk_buff *skb)
>> +{
>> +       struct ixgbe_adapter *adapter = netdev_priv(rx_ring->netdev);
>> +       u16 pkt_info = le16_to_cpu(rx_desc->wb.lower.lo_dword.hs_rss.pkt_info);
>> +       u16 ipsec_pkt_types = IXGBE_RXDADV_PKTTYPE_IPSEC_AH |
>> +                               IXGBE_RXDADV_PKTTYPE_IPSEC_ESP;
>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>> +       struct xfrm_offload *xo = NULL;
>> +       struct xfrm_state *xs = NULL;
>> +       struct iphdr *iph;
>> +       u8 *c_hdr;
>> +       __be32 spi;
>> +       u8 proto;
>> +
>> +       /* we can assume no vlan header in the way, b/c the
>> +        * hw won't recognize the IPsec packet and anyway the
>> +        * currently vlan device doesn't support xfrm offload.
>> +        */
>> +       /* TODO: not supporting IPv6 yet */
>> +       iph = (struct iphdr *)(skb->data + ETH_HLEN);
>> +       c_hdr = (u8 *)iph + iph->ihl * 4;
>> +       switch (pkt_info & ipsec_pkt_types) {
>> +       case IXGBE_RXDADV_PKTTYPE_IPSEC_AH:
>> +               spi = ((struct ip_auth_hdr *)c_hdr)->spi;
>> +               proto = IPPROTO_AH;
>> +               break;
>> +       case IXGBE_RXDADV_PKTTYPE_IPSEC_ESP:
>> +               spi = ((struct ip_esp_hdr *)c_hdr)->spi;
>> +               proto = IPPROTO_ESP;
>> +               break;
>> +       default:
>> +               return;
>> +       }
>> +
>> +       xs = ixgbe_ipsec_find_rx_state(ipsec, iph->daddr, proto, spi);
>> +       if (unlikely(!xs))
>> +               return;
>> +
>> +       skb->sp = secpath_dup(skb->sp);
>> +       if (unlikely(!skb->sp))
>> +               return;
>> +
>> +       skb->sp->xvec[skb->sp->len++] = xs;
>> +       skb->sp->olen++;
>> +       xo = xfrm_offload(skb);
>> +       xo->flags = CRYPTO_DONE;
>> +       xo->status = CRYPTO_SUCCESS;
>> +}
>> +
>> +/**
>>    * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
>>    * @adapter: board private structure
>>    **/
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index 6eabf92..60f9f2d 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -1755,6 +1755,9 @@ static void ixgbe_process_skb_fields(struct ixgbe_ring *rx_ring,
>>
>>          skb_record_rx_queue(skb, rx_ring->queue_index);
>>
>> +       if (ixgbe_test_staterr(rx_desc, IXGBE_RXDADV_STAT_SECP))
>> +               ixgbe_ipsec_rx(rx_ring, rx_desc, skb);
>> +
>>          skb->protocol = eth_type_trans(skb, dev);
>>   }
>>
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan@osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 07/10] ixgbe: process the Rx ipsec offload
@ 2017-12-07  5:43       ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: intel-wired-lan

On 12/5/2017 9:40 AM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> If the chip sees and decrypts an ipsec offload, set up the skb
>> sp pointer with the ralated SA info.  Since the chip is rude
>> enough to keep to itself the table index it used for the
>> decryption, we have to do our own table lookup, using the
>> hash for speed.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  6 ++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 89 ++++++++++++++++++++++++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  3 +
>>   3 files changed, 98 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> index 7e8bca7..77f07dc 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> @@ -1009,9 +1009,15 @@ s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
>>                         u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
>>   #ifdef CONFIG_XFRM_OFFLOAD
>>   void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
>> +void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>> +                   union ixgbe_adv_rx_desc *rx_desc,
>> +                   struct sk_buff *skb);
>>   void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>>   #else
>>   static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
>> +static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>> +                                 union ixgbe_adv_rx_desc *rx_desc,
>> +                                 struct sk_buff *skb) { };
>>   static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
>>   #endif /* CONFIG_XFRM_OFFLOAD */
>>   #endif /* _IXGBE_H_ */
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> index b93ee7f..fd06d9b 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> @@ -379,6 +379,35 @@ static int ixgbe_ipsec_find_empty_idx(struct ixgbe_ipsec *ipsec, bool rxtable)
>>   }
>>
>>   /**
>> + * ixgbe_ipsec_find_rx_state - find the state that matches
>> + * @ipsec: pointer to ipsec struct
>> + * @daddr: inbound address to match
>> + * @proto: protocol to match
>> + * @spi: SPI to match
>> + *
>> + * Returns a pointer to the matching SA state information
>> + **/
>> +static struct xfrm_state *ixgbe_ipsec_find_rx_state(struct ixgbe_ipsec *ipsec,
>> +                                                   __be32 daddr, u8 proto,
>> +                                                   __be32 spi)
>> +{
>> +       struct rx_sa *rsa;
>> +       struct xfrm_state *ret = NULL;
>> +
>> +       rcu_read_lock();
>> +       hash_for_each_possible_rcu(ipsec->rx_sa_list, rsa, hlist, spi)
>> +               if (spi == rsa->xs->id.spi &&
>> +                   daddr == rsa->xs->id.daddr.a4 &&
>> +                   proto == rsa->xs->id.proto) {
>> +                       ret = rsa->xs;
>> +                       xfrm_state_hold(ret);
>> +                       break;
>> +               }
>> +       rcu_read_unlock();
>> +       return ret;
>> +}
>> +
> 
> You need to choose a bucket, not just walk through all buckets.

I may be wrong, but I believe that is what is happening here, where the 
spi is the hash key.  As the function description says "iterate over all 
possible objects hashing to the same bucket".  Besides, I basically 
cribbed this directly from our Mellanox friends (thanks!).

> Otherwise you might as well have just used a linked list. You might
> look at using something like jhash_3words to generate a hash which you
> then use to choose the bucket.
> 
>> +/**
>>    * ixgbe_ipsec_parse_proto_keys - find the key and salt based on the protocol
>>    * @xs: pointer to xfrm_state struct
>>    * @mykey: pointer to key array to populate
>> @@ -680,6 +709,66 @@ static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
>>   };
>>
>>   /**
>> + * ixgbe_ipsec_rx - decode ipsec bits from Rx descriptor
>> + * @rx_ring: receiving ring
>> + * @rx_desc: receive data descriptor
>> + * @skb: current data packet
>> + *
>> + * Determine if there was an ipsec encapsulation noticed, and if so set up
>> + * the resulting status for later in the receive stack.
>> + **/
>> +void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>> +                   union ixgbe_adv_rx_desc *rx_desc,
>> +                   struct sk_buff *skb)
>> +{
>> +       struct ixgbe_adapter *adapter = netdev_priv(rx_ring->netdev);
>> +       u16 pkt_info = le16_to_cpu(rx_desc->wb.lower.lo_dword.hs_rss.pkt_info);
>> +       u16 ipsec_pkt_types = IXGBE_RXDADV_PKTTYPE_IPSEC_AH |
>> +                               IXGBE_RXDADV_PKTTYPE_IPSEC_ESP;
>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>> +       struct xfrm_offload *xo = NULL;
>> +       struct xfrm_state *xs = NULL;
>> +       struct iphdr *iph;
>> +       u8 *c_hdr;
>> +       __be32 spi;
>> +       u8 proto;
>> +
>> +       /* we can assume no vlan header in the way, b/c the
>> +        * hw won't recognize the IPsec packet and anyway the
>> +        * currently vlan device doesn't support xfrm offload.
>> +        */
>> +       /* TODO: not supporting IPv6 yet */
>> +       iph = (struct iphdr *)(skb->data + ETH_HLEN);
>> +       c_hdr = (u8 *)iph + iph->ihl * 4;
>> +       switch (pkt_info & ipsec_pkt_types) {
>> +       case IXGBE_RXDADV_PKTTYPE_IPSEC_AH:
>> +               spi = ((struct ip_auth_hdr *)c_hdr)->spi;
>> +               proto = IPPROTO_AH;
>> +               break;
>> +       case IXGBE_RXDADV_PKTTYPE_IPSEC_ESP:
>> +               spi = ((struct ip_esp_hdr *)c_hdr)->spi;
>> +               proto = IPPROTO_ESP;
>> +               break;
>> +       default:
>> +               return;
>> +       }
>> +
>> +       xs = ixgbe_ipsec_find_rx_state(ipsec, iph->daddr, proto, spi);
>> +       if (unlikely(!xs))
>> +               return;
>> +
>> +       skb->sp = secpath_dup(skb->sp);
>> +       if (unlikely(!skb->sp))
>> +               return;
>> +
>> +       skb->sp->xvec[skb->sp->len++] = xs;
>> +       skb->sp->olen++;
>> +       xo = xfrm_offload(skb);
>> +       xo->flags = CRYPTO_DONE;
>> +       xo->status = CRYPTO_SUCCESS;
>> +}
>> +
>> +/**
>>    * ixgbe_init_ipsec_offload - initialize security registers for IPSec operation
>>    * @adapter: board private structure
>>    **/
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index 6eabf92..60f9f2d 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -1755,6 +1755,9 @@ static void ixgbe_process_skb_fields(struct ixgbe_ring *rx_ring,
>>
>>          skb_record_rx_queue(skb, rx_ring->queue_index);
>>
>> +       if (ixgbe_test_staterr(rx_desc, IXGBE_RXDADV_STAT_SECP))
>> +               ixgbe_ipsec_rx(rx_ring, rx_desc, skb);
>> +
>>          skb->protocol = eth_type_trans(skb, dev);
>>   }
>>
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan at osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 08/10] ixgbe: process the Tx ipsec offload
  2017-12-05 18:13     ` Alexander Duyck
@ 2017-12-07  5:43       ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On 12/5/2017 10:13 AM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> If the skb has a security association referenced in the skb, then
>> set up the Tx descriptor with the ipsec offload bits.  While we're
>> here, we fix an oddly named field in the context descriptor struct.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       | 10 +++-
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 77 ++++++++++++++++++++++++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c   |  4 +-
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 38 ++++++++++---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_type.h  |  2 +-
>>   5 files changed, 118 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> index 77f07dc..68097fe 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> @@ -171,10 +171,11 @@ enum ixgbe_tx_flags {
>>          IXGBE_TX_FLAGS_CC       = 0x08,
>>          IXGBE_TX_FLAGS_IPV4     = 0x10,
>>          IXGBE_TX_FLAGS_CSUM     = 0x20,
>> +       IXGBE_TX_FLAGS_IPSEC    = 0x40,
>>
>>          /* software defined flags */
>> -       IXGBE_TX_FLAGS_SW_VLAN  = 0x40,
>> -       IXGBE_TX_FLAGS_FCOE     = 0x80,
>> +       IXGBE_TX_FLAGS_SW_VLAN  = 0x80,
>> +       IXGBE_TX_FLAGS_FCOE     = 0x100,
>>   };
>>
>>   /* VLAN info */
>> @@ -1012,12 +1013,17 @@ void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
>>   void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>                      union ixgbe_adv_rx_desc *rx_desc,
>>                      struct sk_buff *skb);
>> +int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
>> +                  __be16 protocol, struct ixgbe_ipsec_tx_data *itd);
>>   void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>>   #else
>>   static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
>>   static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>                                    union ixgbe_adv_rx_desc *rx_desc,
>>                                    struct sk_buff *skb) { };
>> +static inline int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring,
>> +                                struct sk_buff *skb, __be16 protocol,
>> +                                struct ixgbe_ipsec_tx_data *itd) { return 0; };
>>   static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
>>   #endif /* CONFIG_XFRM_OFFLOAD */
>>   #endif /* _IXGBE_H_ */
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> index fd06d9b..2a0dd7a 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> @@ -703,12 +703,89 @@ static void ixgbe_ipsec_del_sa(struct xfrm_state *xs)
>>          }
>>   }
>>
>> +/**
>> + * ixgbe_ipsec_offload_ok - can this packet use the xfrm hw offload
>> + * @skb: current data packet
>> + * @xs: pointer to transformer state struct
>> + **/
>> +static bool ixgbe_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
>> +{
>> +       if (xs->props.family == AF_INET) {
>> +               /* Offload with IPv4 options is not supported yet */
>> +               if (ip_hdr(skb)->ihl > 5)
> 
> I would make this ihl != 5 instead of "> 5" since smaller values would
> be invalid as well.

Sure

> 
>> +                       return false;
>> +       } else {
>> +               /* Offload with IPv6 extension headers is not support yet */
>> +               if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
>> +                       return false;
>> +       }
>> +
>> +       return true;
>> +}
>> +
>>   static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
>>          .xdo_dev_state_add = ixgbe_ipsec_add_sa,
>>          .xdo_dev_state_delete = ixgbe_ipsec_del_sa,
>> +       .xdo_dev_offload_ok = ixgbe_ipsec_offload_ok,
>>   };
>>
>>   /**
>> + * ixgbe_ipsec_tx - setup Tx flags for ipsec offload
>> + * @tx_ring: outgoing context
>> + * @skb: current data packet
>> + * @protocol: network protocol
>> + * @itd: ipsec Tx data for later use in building context descriptor
>> + **/
>> +int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
>> +                  __be16 protocol, struct ixgbe_ipsec_tx_data *itd)
>> +{
>> +       struct ixgbe_adapter *adapter = netdev_priv(tx_ring->netdev);
>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>> +       struct xfrm_state *xs;
>> +       struct tx_sa *tsa;
>> +
>> +       if (!skb->sp->len) {
>> +               netdev_err(tx_ring->netdev, "%s: no xfrm state len = %d\n",
>> +                          __func__, skb->sp->len);
>> +               return 0;
>> +       }
>> +
>> +       xs = xfrm_input_state(skb);
>> +       if (!xs) {
>> +               netdev_err(tx_ring->netdev, "%s: no xfrm_input_state() xs = %p\n",
>> +                          __func__, xs);
>> +               return 0;
>> +       }
>> +
>> +       itd->sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX;
>> +       if (itd->sa_idx > IXGBE_IPSEC_MAX_SA_COUNT) {
>> +               netdev_err(tx_ring->netdev, "%s: bad sa_idx=%d handle=%lu\n",
>> +                          __func__, itd->sa_idx, xs->xso.offload_handle);
>> +               return 0;
>> +       }
>> +
>> +       tsa = &ipsec->tx_tbl[itd->sa_idx];
>> +       if (!tsa->used) {
>> +               netdev_err(tx_ring->netdev, "%s: unused sa_idx=%d\n",
>> +                          __func__, itd->sa_idx);
>> +               return 0;
>> +       }
>> +
>> +       itd->flags = 0;
>> +       if (xs->id.proto == IPPROTO_ESP) {
>> +               itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_TYPE_ESP |
>> +                             IXGBE_ADVTXD_TUCMD_L4T_TCP;
> 
> Why is the TCP value being set here? This doesn't seem correct either.
> This implies TCP a TCP offload. It seems like this should only be
> setting ESP.

Honestly?  Because when I was testing that, it didn't work without it. 
This was one of the things I was going to come back to when I started 
working on the csum and tso support.

> 
>> +               if (protocol == htons(ETH_P_IP))
>> +                       itd->flags |= IXGBE_ADVTXD_TUCMD_IPV4;
> 
> Does the IPsec offload need to know if the frame is v4 or v6? I'm just
> wondering if it does or not. 

Yes, I believe this is how it knows how much header to skip to find the 
ESP header.  However, I'll test that and see if it can come out.

> If not then this probably isn't needed.
> One thought on this line is you might look at moving it into
> ixgbe_tx_csum. If setting the bit is harmless without setting IXSM we
> might look at moving it into the end of ixgbe_tx_csum and just make it
> compare against first->protocol there.

> 
>> +               itd->trailer_len = xs->props.trailer_len;
>> +       }
>> +       if (tsa->encrypt)
>> +               itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN;
>> +
>> +       return 1;
>> +}
>> +
>> +/**
>>    * ixgbe_ipsec_rx - decode ipsec bits from Rx descriptor
>>    * @rx_ring: receiving ring
>>    * @rx_desc: receive data descriptor
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>> index f1bfae0..d7875b3 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>> @@ -1261,7 +1261,7 @@ void ixgbe_clear_interrupt_scheme(struct ixgbe_adapter *adapter)
>>   }
>>
>>   void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring, u32 vlan_macip_lens,
>> -                      u32 fcoe_sof_eof, u32 type_tucmd, u32 mss_l4len_idx)
>> +                      u32 fceof_saidx, u32 type_tucmd, u32 mss_l4len_idx)
>>   {
>>          struct ixgbe_adv_tx_context_desc *context_desc;
>>          u16 i = tx_ring->next_to_use;
>> @@ -1275,7 +1275,7 @@ void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring, u32 vlan_macip_lens,
>>          type_tucmd |= IXGBE_TXD_CMD_DEXT | IXGBE_ADVTXD_DTYP_CTXT;
>>
>>          context_desc->vlan_macip_lens   = cpu_to_le32(vlan_macip_lens);
>> -       context_desc->seqnum_seed       = cpu_to_le32(fcoe_sof_eof);
>> +       context_desc->fceof_saidx       = cpu_to_le32(fceof_saidx);
>>          context_desc->type_tucmd_mlhl   = cpu_to_le32(type_tucmd);
>>          context_desc->mss_l4len_idx     = cpu_to_le32(mss_l4len_idx);
>>   }
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index 60f9f2d..c857594 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -7659,9 +7659,10 @@ static void ixgbe_service_task(struct work_struct *work)
>>
>>   static int ixgbe_tso(struct ixgbe_ring *tx_ring,
>>                       struct ixgbe_tx_buffer *first,
>> -                    u8 *hdr_len)
>> +                    u8 *hdr_len,
>> +                    struct ixgbe_ipsec_tx_data *itd)
>>   {
>> -       u32 vlan_macip_lens, type_tucmd, mss_l4len_idx;
>> +       u32 vlan_macip_lens, type_tucmd, mss_l4len_idx, fceof_saidx = 0;
>>          struct sk_buff *skb = first->skb;
>>          union {
>>                  struct iphdr *v4;
>> @@ -7740,7 +7741,12 @@ static int ixgbe_tso(struct ixgbe_ring *tx_ring,
>>          vlan_macip_lens |= (ip.hdr - skb->data) << IXGBE_ADVTXD_MACLEN_SHIFT;
>>          vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
>>
>> -       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd,
>> +       if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
>> +               fceof_saidx |= itd->sa_idx;
>> +               type_tucmd |= itd->flags | itd->trailer_len;
>> +       }
>> +
>> +       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx, type_tucmd,
>>                            mss_l4len_idx);
>>
>>          return 1;
>> @@ -7756,10 +7762,12 @@ static inline bool ixgbe_ipv6_csum_is_sctp(struct sk_buff *skb)
>>   }
>>
>>   static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
>> -                         struct ixgbe_tx_buffer *first)
>> +                         struct ixgbe_tx_buffer *first,
>> +                         struct ixgbe_ipsec_tx_data *itd)
>>   {
>>          struct sk_buff *skb = first->skb;
>>          u32 vlan_macip_lens = 0;
>> +       u32 fceof_saidx = 0;
>>          u32 type_tucmd = 0;
>>
>>          if (skb->ip_summed != CHECKSUM_PARTIAL) {
>> @@ -7800,7 +7808,12 @@ static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
>>          vlan_macip_lens |= skb_network_offset(skb) << IXGBE_ADVTXD_MACLEN_SHIFT;
>>          vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
>>
>> -       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd, 0);
>> +       if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
>> +               fceof_saidx |= itd->sa_idx;
>> +               type_tucmd |= itd->flags | itd->trailer_len;
>> +       }
>> +
>> +       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx, type_tucmd, 0);
>>   }
>>
>>   #define IXGBE_SET_FLAG(_input, _flag, _result) \
>> @@ -7843,11 +7856,16 @@ static void ixgbe_tx_olinfo_status(union ixgbe_adv_tx_desc *tx_desc,
>>                                          IXGBE_TX_FLAGS_CSUM,
>>                                          IXGBE_ADVTXD_POPTS_TXSM);
>>
>> -       /* enble IPv4 checksum for TSO */
>> +       /* enable IPv4 checksum for TSO */
>>          olinfo_status |= IXGBE_SET_FLAG(tx_flags,
>>                                          IXGBE_TX_FLAGS_IPV4,
>>                                          IXGBE_ADVTXD_POPTS_IXSM);
>>
>> +       /* enable IPsec */
>> +       olinfo_status |= IXGBE_SET_FLAG(tx_flags,
>> +                                       IXGBE_TX_FLAGS_IPSEC,
>> +                                       IXGBE_ADVTXD_POPTS_IPSEC);
>> +
>>          /*
>>           * Check Context must be set if Tx switch is enabled, which it
>>           * always is for case where virtual functions are running
>> @@ -8306,6 +8324,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
>>          u32 tx_flags = 0;
>>          unsigned short f;
>>          u16 count = TXD_USE_COUNT(skb_headlen(skb));
>> +       struct ixgbe_ipsec_tx_data ipsec_tx = { 0 };
>>          __be16 protocol = skb->protocol;
>>          u8 hdr_len = 0;
>>
>> @@ -8394,6 +8413,9 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
>>                  }
>>          }
>>
>> +       if (skb->sp && ixgbe_ipsec_tx(tx_ring, skb, protocol, &ipsec_tx))
>> +               tx_flags |= IXGBE_TX_FLAGS_IPSEC | IXGBE_TX_FLAGS_CC;
> 
> You might just want to pull the skb->sp check into ixgbe_ipsec_tx and
> could pass tx_flags as a part of the first buffer. It doesn't really
> matter anyway as most of this will just be inlined so it will all end
> up a part of the same function anyway.

Since the function is defined in a different .o file, are you sure it 
will get inlined?  I put the skb->sp check here to make sure we don't do 
an unnecessary jump.

> 
> Also I would move this down so that it is handled after the fields in
> the first buffer_info structure are set. Then this can ll just fall
> inline with the TSO block and get handled there.
> 
>> +
>>          /* record initial flags and protocol */
>>          first->tx_flags = tx_flags;
>>          first->protocol = protocol;
>> @@ -8410,11 +8432,11 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
>>          }
>>
>>   #endif /* IXGBE_FCOE */
> 
> So if you move the function down here it will help to avoid any other
> complication. In addition you could follow the same logic that we do
> for ixgbe_tso/fso so you could drop the frame instead of transmitting
> it if it is requesting a bad offload.

Sure

sln

> 
>> -       tso = ixgbe_tso(tx_ring, first, &hdr_len);
>> +       tso = ixgbe_tso(tx_ring, first, &hdr_len, &ipsec_tx);
>>          if (tso < 0)
>>                  goto out_drop;
>>          else if (!tso)
>> -               ixgbe_tx_csum(tx_ring, first);
>> +               ixgbe_tx_csum(tx_ring, first, &ipsec_tx);
>>
>>          /* add the ATR filter if ATR is on */
>>          if (test_bit(__IXGBE_TX_FDIR_INIT_DONE, &tx_ring->state))
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>> index 3df0763..0ac725fa 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>> @@ -2856,7 +2856,7 @@ union ixgbe_adv_rx_desc {
>>   /* Context descriptors */
>>   struct ixgbe_adv_tx_context_desc {
>>          __le32 vlan_macip_lens;
>> -       __le32 seqnum_seed;
>> +       __le32 fceof_saidx;
>>          __le32 type_tucmd_mlhl;
>>          __le32 mss_l4len_idx;
>>   };
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan@osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 08/10] ixgbe: process the Tx ipsec offload
@ 2017-12-07  5:43       ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: intel-wired-lan

On 12/5/2017 10:13 AM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> If the skb has a security association referenced in the skb, then
>> set up the Tx descriptor with the ipsec offload bits.  While we're
>> here, we fix an oddly named field in the context descriptor struct.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       | 10 +++-
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 77 ++++++++++++++++++++++++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c   |  4 +-
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 38 ++++++++++---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_type.h  |  2 +-
>>   5 files changed, 118 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> index 77f07dc..68097fe 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> @@ -171,10 +171,11 @@ enum ixgbe_tx_flags {
>>          IXGBE_TX_FLAGS_CC       = 0x08,
>>          IXGBE_TX_FLAGS_IPV4     = 0x10,
>>          IXGBE_TX_FLAGS_CSUM     = 0x20,
>> +       IXGBE_TX_FLAGS_IPSEC    = 0x40,
>>
>>          /* software defined flags */
>> -       IXGBE_TX_FLAGS_SW_VLAN  = 0x40,
>> -       IXGBE_TX_FLAGS_FCOE     = 0x80,
>> +       IXGBE_TX_FLAGS_SW_VLAN  = 0x80,
>> +       IXGBE_TX_FLAGS_FCOE     = 0x100,
>>   };
>>
>>   /* VLAN info */
>> @@ -1012,12 +1013,17 @@ void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
>>   void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>                      union ixgbe_adv_rx_desc *rx_desc,
>>                      struct sk_buff *skb);
>> +int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
>> +                  __be16 protocol, struct ixgbe_ipsec_tx_data *itd);
>>   void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>>   #else
>>   static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter) { };
>>   static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>                                    union ixgbe_adv_rx_desc *rx_desc,
>>                                    struct sk_buff *skb) { };
>> +static inline int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring,
>> +                                struct sk_buff *skb, __be16 protocol,
>> +                                struct ixgbe_ipsec_tx_data *itd) { return 0; };
>>   static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) { };
>>   #endif /* CONFIG_XFRM_OFFLOAD */
>>   #endif /* _IXGBE_H_ */
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> index fd06d9b..2a0dd7a 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> @@ -703,12 +703,89 @@ static void ixgbe_ipsec_del_sa(struct xfrm_state *xs)
>>          }
>>   }
>>
>> +/**
>> + * ixgbe_ipsec_offload_ok - can this packet use the xfrm hw offload
>> + * @skb: current data packet
>> + * @xs: pointer to transformer state struct
>> + **/
>> +static bool ixgbe_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
>> +{
>> +       if (xs->props.family == AF_INET) {
>> +               /* Offload with IPv4 options is not supported yet */
>> +               if (ip_hdr(skb)->ihl > 5)
> 
> I would make this ihl != 5 instead of "> 5" since smaller values would
> be invalid as well.

Sure

> 
>> +                       return false;
>> +       } else {
>> +               /* Offload with IPv6 extension headers is not support yet */
>> +               if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
>> +                       return false;
>> +       }
>> +
>> +       return true;
>> +}
>> +
>>   static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
>>          .xdo_dev_state_add = ixgbe_ipsec_add_sa,
>>          .xdo_dev_state_delete = ixgbe_ipsec_del_sa,
>> +       .xdo_dev_offload_ok = ixgbe_ipsec_offload_ok,
>>   };
>>
>>   /**
>> + * ixgbe_ipsec_tx - setup Tx flags for ipsec offload
>> + * @tx_ring: outgoing context
>> + * @skb: current data packet
>> + * @protocol: network protocol
>> + * @itd: ipsec Tx data for later use in building context descriptor
>> + **/
>> +int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
>> +                  __be16 protocol, struct ixgbe_ipsec_tx_data *itd)
>> +{
>> +       struct ixgbe_adapter *adapter = netdev_priv(tx_ring->netdev);
>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>> +       struct xfrm_state *xs;
>> +       struct tx_sa *tsa;
>> +
>> +       if (!skb->sp->len) {
>> +               netdev_err(tx_ring->netdev, "%s: no xfrm state len = %d\n",
>> +                          __func__, skb->sp->len);
>> +               return 0;
>> +       }
>> +
>> +       xs = xfrm_input_state(skb);
>> +       if (!xs) {
>> +               netdev_err(tx_ring->netdev, "%s: no xfrm_input_state() xs = %p\n",
>> +                          __func__, xs);
>> +               return 0;
>> +       }
>> +
>> +       itd->sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX;
>> +       if (itd->sa_idx > IXGBE_IPSEC_MAX_SA_COUNT) {
>> +               netdev_err(tx_ring->netdev, "%s: bad sa_idx=%d handle=%lu\n",
>> +                          __func__, itd->sa_idx, xs->xso.offload_handle);
>> +               return 0;
>> +       }
>> +
>> +       tsa = &ipsec->tx_tbl[itd->sa_idx];
>> +       if (!tsa->used) {
>> +               netdev_err(tx_ring->netdev, "%s: unused sa_idx=%d\n",
>> +                          __func__, itd->sa_idx);
>> +               return 0;
>> +       }
>> +
>> +       itd->flags = 0;
>> +       if (xs->id.proto == IPPROTO_ESP) {
>> +               itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_TYPE_ESP |
>> +                             IXGBE_ADVTXD_TUCMD_L4T_TCP;
> 
> Why is the TCP value being set here? This doesn't seem correct either.
> This implies TCP a TCP offload. It seems like this should only be
> setting ESP.

Honestly?  Because when I was testing that, it didn't work without it. 
This was one of the things I was going to come back to when I started 
working on the csum and tso support.

> 
>> +               if (protocol == htons(ETH_P_IP))
>> +                       itd->flags |= IXGBE_ADVTXD_TUCMD_IPV4;
> 
> Does the IPsec offload need to know if the frame is v4 or v6? I'm just
> wondering if it does or not. 

Yes, I believe this is how it knows how much header to skip to find the 
ESP header.  However, I'll test that and see if it can come out.

> If not then this probably isn't needed.
> One thought on this line is you might look at moving it into
> ixgbe_tx_csum. If setting the bit is harmless without setting IXSM we
> might look at moving it into the end of ixgbe_tx_csum and just make it
> compare against first->protocol there.

> 
>> +               itd->trailer_len = xs->props.trailer_len;
>> +       }
>> +       if (tsa->encrypt)
>> +               itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN;
>> +
>> +       return 1;
>> +}
>> +
>> +/**
>>    * ixgbe_ipsec_rx - decode ipsec bits from Rx descriptor
>>    * @rx_ring: receiving ring
>>    * @rx_desc: receive data descriptor
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>> index f1bfae0..d7875b3 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>> @@ -1261,7 +1261,7 @@ void ixgbe_clear_interrupt_scheme(struct ixgbe_adapter *adapter)
>>   }
>>
>>   void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring, u32 vlan_macip_lens,
>> -                      u32 fcoe_sof_eof, u32 type_tucmd, u32 mss_l4len_idx)
>> +                      u32 fceof_saidx, u32 type_tucmd, u32 mss_l4len_idx)
>>   {
>>          struct ixgbe_adv_tx_context_desc *context_desc;
>>          u16 i = tx_ring->next_to_use;
>> @@ -1275,7 +1275,7 @@ void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring, u32 vlan_macip_lens,
>>          type_tucmd |= IXGBE_TXD_CMD_DEXT | IXGBE_ADVTXD_DTYP_CTXT;
>>
>>          context_desc->vlan_macip_lens   = cpu_to_le32(vlan_macip_lens);
>> -       context_desc->seqnum_seed       = cpu_to_le32(fcoe_sof_eof);
>> +       context_desc->fceof_saidx       = cpu_to_le32(fceof_saidx);
>>          context_desc->type_tucmd_mlhl   = cpu_to_le32(type_tucmd);
>>          context_desc->mss_l4len_idx     = cpu_to_le32(mss_l4len_idx);
>>   }
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index 60f9f2d..c857594 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -7659,9 +7659,10 @@ static void ixgbe_service_task(struct work_struct *work)
>>
>>   static int ixgbe_tso(struct ixgbe_ring *tx_ring,
>>                       struct ixgbe_tx_buffer *first,
>> -                    u8 *hdr_len)
>> +                    u8 *hdr_len,
>> +                    struct ixgbe_ipsec_tx_data *itd)
>>   {
>> -       u32 vlan_macip_lens, type_tucmd, mss_l4len_idx;
>> +       u32 vlan_macip_lens, type_tucmd, mss_l4len_idx, fceof_saidx = 0;
>>          struct sk_buff *skb = first->skb;
>>          union {
>>                  struct iphdr *v4;
>> @@ -7740,7 +7741,12 @@ static int ixgbe_tso(struct ixgbe_ring *tx_ring,
>>          vlan_macip_lens |= (ip.hdr - skb->data) << IXGBE_ADVTXD_MACLEN_SHIFT;
>>          vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
>>
>> -       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd,
>> +       if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
>> +               fceof_saidx |= itd->sa_idx;
>> +               type_tucmd |= itd->flags | itd->trailer_len;
>> +       }
>> +
>> +       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx, type_tucmd,
>>                            mss_l4len_idx);
>>
>>          return 1;
>> @@ -7756,10 +7762,12 @@ static inline bool ixgbe_ipv6_csum_is_sctp(struct sk_buff *skb)
>>   }
>>
>>   static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
>> -                         struct ixgbe_tx_buffer *first)
>> +                         struct ixgbe_tx_buffer *first,
>> +                         struct ixgbe_ipsec_tx_data *itd)
>>   {
>>          struct sk_buff *skb = first->skb;
>>          u32 vlan_macip_lens = 0;
>> +       u32 fceof_saidx = 0;
>>          u32 type_tucmd = 0;
>>
>>          if (skb->ip_summed != CHECKSUM_PARTIAL) {
>> @@ -7800,7 +7808,12 @@ static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
>>          vlan_macip_lens |= skb_network_offset(skb) << IXGBE_ADVTXD_MACLEN_SHIFT;
>>          vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
>>
>> -       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd, 0);
>> +       if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
>> +               fceof_saidx |= itd->sa_idx;
>> +               type_tucmd |= itd->flags | itd->trailer_len;
>> +       }
>> +
>> +       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx, type_tucmd, 0);
>>   }
>>
>>   #define IXGBE_SET_FLAG(_input, _flag, _result) \
>> @@ -7843,11 +7856,16 @@ static void ixgbe_tx_olinfo_status(union ixgbe_adv_tx_desc *tx_desc,
>>                                          IXGBE_TX_FLAGS_CSUM,
>>                                          IXGBE_ADVTXD_POPTS_TXSM);
>>
>> -       /* enble IPv4 checksum for TSO */
>> +       /* enable IPv4 checksum for TSO */
>>          olinfo_status |= IXGBE_SET_FLAG(tx_flags,
>>                                          IXGBE_TX_FLAGS_IPV4,
>>                                          IXGBE_ADVTXD_POPTS_IXSM);
>>
>> +       /* enable IPsec */
>> +       olinfo_status |= IXGBE_SET_FLAG(tx_flags,
>> +                                       IXGBE_TX_FLAGS_IPSEC,
>> +                                       IXGBE_ADVTXD_POPTS_IPSEC);
>> +
>>          /*
>>           * Check Context must be set if Tx switch is enabled, which it
>>           * always is for case where virtual functions are running
>> @@ -8306,6 +8324,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
>>          u32 tx_flags = 0;
>>          unsigned short f;
>>          u16 count = TXD_USE_COUNT(skb_headlen(skb));
>> +       struct ixgbe_ipsec_tx_data ipsec_tx = { 0 };
>>          __be16 protocol = skb->protocol;
>>          u8 hdr_len = 0;
>>
>> @@ -8394,6 +8413,9 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
>>                  }
>>          }
>>
>> +       if (skb->sp && ixgbe_ipsec_tx(tx_ring, skb, protocol, &ipsec_tx))
>> +               tx_flags |= IXGBE_TX_FLAGS_IPSEC | IXGBE_TX_FLAGS_CC;
> 
> You might just want to pull the skb->sp check into ixgbe_ipsec_tx and
> could pass tx_flags as a part of the first buffer. It doesn't really
> matter anyway as most of this will just be inlined so it will all end
> up a part of the same function anyway.

Since the function is defined in a different .o file, are you sure it 
will get inlined?  I put the skb->sp check here to make sure we don't do 
an unnecessary jump.

> 
> Also I would move this down so that it is handled after the fields in
> the first buffer_info structure are set. Then this can ll just fall
> inline with the TSO block and get handled there.
> 
>> +
>>          /* record initial flags and protocol */
>>          first->tx_flags = tx_flags;
>>          first->protocol = protocol;
>> @@ -8410,11 +8432,11 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
>>          }
>>
>>   #endif /* IXGBE_FCOE */
> 
> So if you move the function down here it will help to avoid any other
> complication. In addition you could follow the same logic that we do
> for ixgbe_tso/fso so you could drop the frame instead of transmitting
> it if it is requesting a bad offload.

Sure

sln

> 
>> -       tso = ixgbe_tso(tx_ring, first, &hdr_len);
>> +       tso = ixgbe_tso(tx_ring, first, &hdr_len, &ipsec_tx);
>>          if (tso < 0)
>>                  goto out_drop;
>>          else if (!tso)
>> -               ixgbe_tx_csum(tx_ring, first);
>> +               ixgbe_tx_csum(tx_ring, first, &ipsec_tx);
>>
>>          /* add the ATR filter if ATR is on */
>>          if (test_bit(__IXGBE_TX_FDIR_INIT_DONE, &tx_ring->state))
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>> index 3df0763..0ac725fa 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>> @@ -2856,7 +2856,7 @@ union ixgbe_adv_rx_desc {
>>   /* Context descriptors */
>>   struct ixgbe_adv_tx_context_desc {
>>          __le32 vlan_macip_lens;
>> -       __le32 seqnum_seed;
>> +       __le32 fceof_saidx;
>>          __le32 type_tucmd_mlhl;
>>          __le32 mss_l4len_idx;
>>   };
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan at osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 09/10] ixgbe: ipsec offload stats
  2017-12-05 19:53     ` Alexander Duyck
@ 2017-12-07  5:43       ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On 12/5/2017 11:53 AM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> Add a simple statistic to count the ipsec offloads.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h         |  1 +
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 28 ++++++++++++++----------
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c   |  3 +++
>>   3 files changed, 20 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> index 68097fe..bb66c85 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> @@ -265,6 +265,7 @@ struct ixgbe_rx_buffer {
>>   struct ixgbe_queue_stats {
>>          u64 packets;
>>          u64 bytes;
>> +       u64 ipsec_offloads;
>>   };
>>
>>   struct ixgbe_tx_queue_stats {
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> index c3e7a81..dddbc74 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> @@ -1233,34 +1233,34 @@ static void ixgbe_get_ethtool_stats(struct net_device *netdev,
>>          for (j = 0; j < netdev->num_tx_queues; j++) {
>>                  ring = adapter->tx_ring[j];
>>                  if (!ring) {
>> -                       data[i] = 0;
>> -                       data[i+1] = 0;
>> -                       i += 2;
>> +                       data[i++] = 0;
>> +                       data[i++] = 0;
>> +                       data[i++] = 0;
>>                          continue;
>>                  }
>>
>>                  do {
>>                          start = u64_stats_fetch_begin_irq(&ring->syncp);
>> -                       data[i]   = ring->stats.packets;
>> -                       data[i+1] = ring->stats.bytes;
>> +                       data[i++] = ring->stats.packets;
>> +                       data[i++] = ring->stats.bytes;
>> +                       data[i++] = ring->stats.ipsec_offloads;
>>                  } while (u64_stats_fetch_retry_irq(&ring->syncp, start));
>> -               i += 2;
>>          }
>>          for (j = 0; j < IXGBE_NUM_RX_QUEUES; j++) {
>>                  ring = adapter->rx_ring[j];
>>                  if (!ring) {
>> -                       data[i] = 0;
>> -                       data[i+1] = 0;
>> -                       i += 2;
>> +                       data[i++] = 0;
>> +                       data[i++] = 0;
>> +                       data[i++] = 0;
>>                          continue;
>>                  }
>>
>>                  do {
>>                          start = u64_stats_fetch_begin_irq(&ring->syncp);
>> -                       data[i]   = ring->stats.packets;
>> -                       data[i+1] = ring->stats.bytes;
>> +                       data[i++] = ring->stats.packets;
>> +                       data[i++] = ring->stats.bytes;
>> +                       data[i++] = ring->stats.ipsec_offloads;
>>                  } while (u64_stats_fetch_retry_irq(&ring->syncp, start));
>> -               i += 2;
>>          }
>>
>>          for (j = 0; j < IXGBE_MAX_PACKET_BUFFERS; j++) {
>> @@ -1297,12 +1297,16 @@ static void ixgbe_get_strings(struct net_device *netdev, u32 stringset,
>>                          p += ETH_GSTRING_LEN;
>>                          sprintf(p, "tx_queue_%u_bytes", i);
>>                          p += ETH_GSTRING_LEN;
>> +                       sprintf(p, "tx_queue_%u_ipsec_offloads", i);
>> +                       p += ETH_GSTRING_LEN;
>>                  }
>>                  for (i = 0; i < IXGBE_NUM_RX_QUEUES; i++) {
>>                          sprintf(p, "rx_queue_%u_packets", i);
>>                          p += ETH_GSTRING_LEN;
>>                          sprintf(p, "rx_queue_%u_bytes", i);
>>                          p += ETH_GSTRING_LEN;
>> +                       sprintf(p, "rx_queue_%u_ipsec_offloads", i);
>> +                       p += ETH_GSTRING_LEN;
>>                  }
>>                  for (i = 0; i < IXGBE_MAX_PACKET_BUFFERS; i++) {
>>                          sprintf(p, "tx_pb_%u_pxon", i);
> 
> I probably wouldn't bother reporting this per ring. It might make more
> sense to handle this as an adapter statistic.

I agree, it really messes up the output.  However, I like seeing it per 
ring while I'm testing.  I'll move it.

> 
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> index 2a0dd7a..d1220bf 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> @@ -782,6 +782,7 @@ int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
>>          if (tsa->encrypt)
>>                  itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN;
>>
>> +       tx_ring->stats.ipsec_offloads++;
>>          return 1;
> 
> Instead of doing this here you may want to make it a part of the Tx
> clean-up path. You should still have the flag bit set so you could
> test a test for the IPSEC flag bit and if it is set on the tx_buffer
> following the transmit you could then increment it there.

Is there a benefit to doing it elsewhere?  I'm assuming the answer has 
to do with fastpath cycles...

sln

> 
>>   }
>>
>> @@ -843,6 +844,8 @@ void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>          xo = xfrm_offload(skb);
>>          xo->flags = CRYPTO_DONE;
>>          xo->status = CRYPTO_SUCCESS;
>> +
>> +       rx_ring->stats.ipsec_offloads++;
>>   }
>>
>>   /**
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan@osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 09/10] ixgbe: ipsec offload stats
@ 2017-12-07  5:43       ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: intel-wired-lan

On 12/5/2017 11:53 AM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> Add a simple statistic to count the ipsec offloads.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h         |  1 +
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 28 ++++++++++++++----------
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c   |  3 +++
>>   3 files changed, 20 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> index 68097fe..bb66c85 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> @@ -265,6 +265,7 @@ struct ixgbe_rx_buffer {
>>   struct ixgbe_queue_stats {
>>          u64 packets;
>>          u64 bytes;
>> +       u64 ipsec_offloads;
>>   };
>>
>>   struct ixgbe_tx_queue_stats {
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> index c3e7a81..dddbc74 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> @@ -1233,34 +1233,34 @@ static void ixgbe_get_ethtool_stats(struct net_device *netdev,
>>          for (j = 0; j < netdev->num_tx_queues; j++) {
>>                  ring = adapter->tx_ring[j];
>>                  if (!ring) {
>> -                       data[i] = 0;
>> -                       data[i+1] = 0;
>> -                       i += 2;
>> +                       data[i++] = 0;
>> +                       data[i++] = 0;
>> +                       data[i++] = 0;
>>                          continue;
>>                  }
>>
>>                  do {
>>                          start = u64_stats_fetch_begin_irq(&ring->syncp);
>> -                       data[i]   = ring->stats.packets;
>> -                       data[i+1] = ring->stats.bytes;
>> +                       data[i++] = ring->stats.packets;
>> +                       data[i++] = ring->stats.bytes;
>> +                       data[i++] = ring->stats.ipsec_offloads;
>>                  } while (u64_stats_fetch_retry_irq(&ring->syncp, start));
>> -               i += 2;
>>          }
>>          for (j = 0; j < IXGBE_NUM_RX_QUEUES; j++) {
>>                  ring = adapter->rx_ring[j];
>>                  if (!ring) {
>> -                       data[i] = 0;
>> -                       data[i+1] = 0;
>> -                       i += 2;
>> +                       data[i++] = 0;
>> +                       data[i++] = 0;
>> +                       data[i++] = 0;
>>                          continue;
>>                  }
>>
>>                  do {
>>                          start = u64_stats_fetch_begin_irq(&ring->syncp);
>> -                       data[i]   = ring->stats.packets;
>> -                       data[i+1] = ring->stats.bytes;
>> +                       data[i++] = ring->stats.packets;
>> +                       data[i++] = ring->stats.bytes;
>> +                       data[i++] = ring->stats.ipsec_offloads;
>>                  } while (u64_stats_fetch_retry_irq(&ring->syncp, start));
>> -               i += 2;
>>          }
>>
>>          for (j = 0; j < IXGBE_MAX_PACKET_BUFFERS; j++) {
>> @@ -1297,12 +1297,16 @@ static void ixgbe_get_strings(struct net_device *netdev, u32 stringset,
>>                          p += ETH_GSTRING_LEN;
>>                          sprintf(p, "tx_queue_%u_bytes", i);
>>                          p += ETH_GSTRING_LEN;
>> +                       sprintf(p, "tx_queue_%u_ipsec_offloads", i);
>> +                       p += ETH_GSTRING_LEN;
>>                  }
>>                  for (i = 0; i < IXGBE_NUM_RX_QUEUES; i++) {
>>                          sprintf(p, "rx_queue_%u_packets", i);
>>                          p += ETH_GSTRING_LEN;
>>                          sprintf(p, "rx_queue_%u_bytes", i);
>>                          p += ETH_GSTRING_LEN;
>> +                       sprintf(p, "rx_queue_%u_ipsec_offloads", i);
>> +                       p += ETH_GSTRING_LEN;
>>                  }
>>                  for (i = 0; i < IXGBE_MAX_PACKET_BUFFERS; i++) {
>>                          sprintf(p, "tx_pb_%u_pxon", i);
> 
> I probably wouldn't bother reporting this per ring. It might make more
> sense to handle this as an adapter statistic.

I agree, it really messes up the output.  However, I like seeing it per 
ring while I'm testing.  I'll move it.

> 
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> index 2a0dd7a..d1220bf 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> @@ -782,6 +782,7 @@ int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
>>          if (tsa->encrypt)
>>                  itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN;
>>
>> +       tx_ring->stats.ipsec_offloads++;
>>          return 1;
> 
> Instead of doing this here you may want to make it a part of the Tx
> clean-up path. You should still have the flag bit set so you could
> test a test for the IPSEC flag bit and if it is set on the tx_buffer
> following the transmit you could then increment it there.

Is there a benefit to doing it elsewhere?  I'm assuming the answer has 
to do with fastpath cycles...

sln

> 
>>   }
>>
>> @@ -843,6 +844,8 @@ void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>          xo = xfrm_offload(skb);
>>          xo->flags = CRYPTO_DONE;
>>          xo->status = CRYPTO_SUCCESS;
>> +
>> +       rx_ring->stats.ipsec_offloads++;
>>   }
>>
>>   /**
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan at osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 10/10] ixgbe: register ipsec offload with the xfrm subsystem
  2017-12-05 20:11     ` Alexander Duyck
@ 2017-12-07  5:43       ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On 12/5/2017 12:11 PM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> With all the support code in place we can now link in the ipsec
>> offload operations and set the ESP feature flag for the XFRM
>> subsystem to see.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 4 ++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 4 ++++
>>   2 files changed, 8 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> index d1220bf..0d5497b 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> @@ -884,6 +884,10 @@ void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
>>          ixgbe_ipsec_clear_hw_tables(adapter);
>>          ixgbe_ipsec_stop_engine(adapter);
>>
>> +       adapter->netdev->xfrmdev_ops = &ixgbe_xfrmdev_ops;
>> +       adapter->netdev->features |= NETIF_F_HW_ESP;
>> +       adapter->netdev->hw_enc_features |= NETIF_F_HW_ESP;
>> +
>>          return;
>>   err:
>>          if (ipsec) {
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index c857594..9231351 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -9799,6 +9799,10 @@ ixgbe_features_check(struct sk_buff *skb, struct net_device *dev,
>>          if (skb->encapsulation && !(features & NETIF_F_TSO_MANGLEID))
>>                  features &= ~NETIF_F_TSO;
>>
>> +       /* IPsec offload doesn't get along well with others *yet* */
>> +       if (skb->sp)
>> +               features &= ~(NETIF_F_TSO | NETIF_F_HW_CSUM_BIT);
> 
> I'm pretty sure the feature flag stripping here isn't correct. The

Well, first of all that NETIF_F_HW_CSUM_BIT should be NETIF_F_HW_CSUM.

> feature bits you want to strip would probably be consistent with the
> network_hdr_len check bits included before the MANGLEID check.
> 
> We should do some digging into this as it may be a kernel issue. I'm
> just wondering if ipsec updates any headers such as the transport
> offset or skb checksum start. If either of those are updated that
> would explain the issues with getting the offloads to work.

Doing this got the TSO and such out of my way so I didn't have to turn 
tx csum off with ethtool, but you're right, this can be tweaked a little.

There will be more digging later when I work on getting TSO and CSUM 
working with ipsec offload, but I want to get these patches out first 
now that they're working, and then tweak for more performance.


Again, thanks for your time and thoughts.

sln

> 
>> +
>>          return features;
>>   }
>>
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan@osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 10/10] ixgbe: register ipsec offload with the xfrm subsystem
@ 2017-12-07  5:43       ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07  5:43 UTC (permalink / raw)
  To: intel-wired-lan

On 12/5/2017 12:11 PM, Alexander Duyck wrote:
> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> With all the support code in place we can now link in the ipsec
>> offload operations and set the ESP feature flag for the XFRM
>> subsystem to see.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 4 ++++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 4 ++++
>>   2 files changed, 8 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> index d1220bf..0d5497b 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>> @@ -884,6 +884,10 @@ void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
>>          ixgbe_ipsec_clear_hw_tables(adapter);
>>          ixgbe_ipsec_stop_engine(adapter);
>>
>> +       adapter->netdev->xfrmdev_ops = &ixgbe_xfrmdev_ops;
>> +       adapter->netdev->features |= NETIF_F_HW_ESP;
>> +       adapter->netdev->hw_enc_features |= NETIF_F_HW_ESP;
>> +
>>          return;
>>   err:
>>          if (ipsec) {
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index c857594..9231351 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -9799,6 +9799,10 @@ ixgbe_features_check(struct sk_buff *skb, struct net_device *dev,
>>          if (skb->encapsulation && !(features & NETIF_F_TSO_MANGLEID))
>>                  features &= ~NETIF_F_TSO;
>>
>> +       /* IPsec offload doesn't get along well with others *yet* */
>> +       if (skb->sp)
>> +               features &= ~(NETIF_F_TSO | NETIF_F_HW_CSUM_BIT);
> 
> I'm pretty sure the feature flag stripping here isn't correct. The

Well, first of all that NETIF_F_HW_CSUM_BIT should be NETIF_F_HW_CSUM.

> feature bits you want to strip would probably be consistent with the
> network_hdr_len check bits included before the MANGLEID check.
> 
> We should do some digging into this as it may be a kernel issue. I'm
> just wondering if ipsec updates any headers such as the transport
> offset or skb checksum start. If either of those are updated that
> would explain the issues with getting the offloads to work.

Doing this got the TSO and such out of my way so I didn't have to turn 
tx csum off with ethtool, but you're right, this can be tweaked a little.

There will be more digging later when I work on getting TSO and CSUM 
working with ipsec offload, but I want to get these patches out first 
now that they're working, and then tweak for more performance.


Again, thanks for your time and thoughts.

sln

> 
>> +
>>          return features;
>>   }
>>
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan at osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 02/10] ixgbe: add ipsec register access routines
  2017-12-07  5:43       ` Shannon Nelson
@ 2017-12-07 16:02         ` Alexander Duyck
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-07 16:02 UTC (permalink / raw)
  To: Shannon Nelson
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On Wed, Dec 6, 2017 at 9:43 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> Thanks, Alex, for your detailed comments, I do appreciate the time and
> thought you put into them.
>
> Responses below...
>
> sln
>
> On 12/5/2017 8:56 AM, Alexander Duyck wrote:
>>
>> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
>> <shannon.nelson@oracle.com> wrote:
>>>
>>> Add a few routines to make access to the ipsec registers just a little
>>> easier, and throw in the beginnings of an initialization.
>>>
>>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>>> ---
>>>   drivers/net/ethernet/intel/ixgbe/Makefile      |   1 +
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       |   6 +
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 157
>>> +++++++++++++++++++++++++
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h |  50 ++++++++
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |   1 +
>>>   5 files changed, 215 insertions(+)
>>>   create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>>   create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>>>
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/Makefile
>>> b/drivers/net/ethernet/intel/ixgbe/Makefile
>>> index 35e6fa6..8319465 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/Makefile
>>> +++ b/drivers/net/ethernet/intel/ixgbe/Makefile
>>> @@ -42,3 +42,4 @@ ixgbe-$(CONFIG_IXGBE_DCB) +=  ixgbe_dcb.o
>>> ixgbe_dcb_82598.o \
>>>   ixgbe-$(CONFIG_IXGBE_HWMON) += ixgbe_sysfs.o
>>>   ixgbe-$(CONFIG_DEBUG_FS) += ixgbe_debugfs.o
>>>   ixgbe-$(CONFIG_FCOE:m=y) += ixgbe_fcoe.o
>>> +ixgbe-$(CONFIG_XFRM_OFFLOAD) += ixgbe_ipsec.o
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> index dd55787..1e11462 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> @@ -52,6 +52,7 @@
>>>   #ifdef CONFIG_IXGBE_DCA
>>>   #include <linux/dca.h>
>>>   #endif
>>> +#include "ixgbe_ipsec.h"
>>>
>>>   #include <net/busy_poll.h>
>>>
>>> @@ -1001,4 +1002,9 @@ void ixgbe_store_key(struct ixgbe_adapter
>>> *adapter);
>>>   void ixgbe_store_reta(struct ixgbe_adapter *adapter);
>>>   s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
>>>                         u32 adv_sym, u32 adv_asm, u32 lp_sym, u32
>>> lp_asm);
>>> +#ifdef CONFIG_XFRM_OFFLOAD
>>> +void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
>>> +#else
>>> +static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter
>>> *adapter) { };
>>> +#endif /* CONFIG_XFRM_OFFLOAD */
>>>   #endif /* _IXGBE_H_ */
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> new file mode 100644
>>> index 0000000..14dd011
>>> --- /dev/null
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> @@ -0,0 +1,157 @@
>>>
>>> +/*******************************************************************************
>>> + *
>>> + * Intel 10 Gigabit PCI Express Linux driver
>>> + * Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> it
>>> + * under the terms and conditions of the GNU General Public License,
>>> + * version 2, as published by the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope it will be useful, but
>>> WITHOUT
>>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>> for
>>> + * more details.
>>> + *
>>> + * You should have received a copy of the GNU General Public License
>>> along with
>>> + * this program.  If not, see <http://www.gnu.org/licenses/>.
>>> + *
>>> + * The full GNU General Public License is included in this distribution
>>> in
>>> + * the file called "COPYING".
>>> + *
>>> + * Contact Information:
>>> + * Linux NICS <linux.nics@intel.com>
>>> + * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
>>> + * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR
>>> 97124-6497
>>> + *
>>> +
>>> ******************************************************************************/
>>> +
>>> +#include "ixgbe.h"
>>> +
>>> +/**
>>> + * ixgbe_ipsec_set_tx_sa - set the Tx SA registers
>>> + * @hw: hw specific details
>>> + * @idx: register index to write
>>> + * @key: key byte array
>>> + * @salt: salt bytes
>>> + **/
>>> +static void ixgbe_ipsec_set_tx_sa(struct ixgbe_hw *hw, u16 idx,
>>> +                                 u32 key[], u32 salt)
>>> +{
>>> +       u32 reg;
>>> +       int i;
>>> +
>>> +       for (i = 0; i < 4; i++)
>>> +               IXGBE_WRITE_REG(hw, IXGBE_IPSTXKEY(i),
>>> cpu_to_be32(key[3-i]));
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXSALT, cpu_to_be32(salt));
>>> +       IXGBE_WRITE_FLUSH(hw);
>>> +
>>> +       reg = IXGBE_READ_REG(hw, IXGBE_IPSTXIDX);
>>> +       reg &= IXGBE_RXTXIDX_IPS_EN;
>>> +       reg |= idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, reg);
>>> +       IXGBE_WRITE_FLUSH(hw);
>>> +}
>>> +
>>
>>
>> So there are a few things here to unpack.
>>
>> The first is the carry-forward of the IPS bit. I'm not sure that is
>> the best way to go. Do we really expect to be updating SA values if
>> IPsec offload is not enabled?
>
>
> In order to save on energy, we don't enable the engine until we have the
> first SA successfully stored in the tables, so the enable bit will be off
> for that one.
>
> Also, the datasheet specifically says for the Rx table "Software should not
> make changes in the Rx SA tables while changing the IPSEC_EN bit." I figured
> I'd use the same method on both tables for consistency.
>
>> If so we may just want to carry a bit
>> flag somewhere in the ixgbe_hw struct indicating if Tx IPsec offload
>> is enabled and use that to determine the value for this bit.
>>
>> Also we should probably replace "3" with a value indicating that it is
>> the SA index shift.
>
>
> Sure, that would be good.
>
>>
>> Also technically the WRITE_FLUSH isn't needed if you are doing a PCIe
>> read anyway to get IPSTXIDX.
>
>
> That's from having to be very fastidious about these reads/writes/flushes
> before the engine actually worked for me.  I could spend time taking them
> out and testing each change again, but they aren't in a fast path, so I'm
> really not worried about it.
>
>
>>
>>> +/**
>>> + * ixgbe_ipsec_set_rx_item - set an Rx table item
>>> + * @hw: hw specific details
>>> + * @idx: register index to write
>>> + * @tbl: table selector
>>> + *
>>> + * Trigger the device to store into a particular Rx table the
>>> + * data that has already been loaded into the input register
>>> + **/
>>> +static void ixgbe_ipsec_set_rx_item(struct ixgbe_hw *hw, u16 idx, u32
>>> tbl)
>>> +{
>>> +       u32 reg;
>>> +
>>> +       reg = IXGBE_READ_REG(hw, IXGBE_IPSRXIDX);
>>> +       reg &= IXGBE_RXTXIDX_IPS_EN;
>>> +       reg |= tbl | idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, reg);
>>> +       IXGBE_WRITE_FLUSH(hw);
>>> +}
>>> +
>>
>>
>> The Rx version of this gets a bit trickier since the datasheet
>> actually indicates there are a few different types of tables that can
>> be indexed via this. Also why is the tbl value not being shifted? It
>> seems like it should be shifted by 1 to avoid overwriting the IPS_EN
>> bit. Really I would like to see the tbl value converted to an enum and
>> shifted by 1 in order to generate the table reference.
>
>
> I would have done this, but we can't use an enum shifted bit because the
> field values are 01, 10, and 11.  I used the direct 2, 4, and 6 values
> rather than shifting by one, but I can reset them and shift by 1.

I didn't mean 1 << enum I was referring to enum << 1. Right now you
can be given a table value of 3 if somebody incorrectly used the
function and the side effect is that it overwrites the enable bit.

>>
>> Here the "3" is a table index. It might be nice to call that out with
>> a name instead of using the magic number.
>
>
> Yep
>
>
>>
>>> +/**
>>> + * ixgbe_ipsec_set_rx_sa - set up the register bits to save SA info
>>> + * @hw: hw specific details
>>> + * @idx: register index to write
>>> + * @spi: security parameter index
>>> + * @key: key byte array
>>> + * @salt: salt bytes
>>> + * @mode: rx decrypt control bits
>>> + * @ip_idx: index into IP table for related IP address
>>> + **/
>>> +static void ixgbe_ipsec_set_rx_sa(struct ixgbe_hw *hw, u16 idx, __be32
>>> spi,
>>> +                                 u32 key[], u32 salt, u32 mode, u32
>>> ip_idx)
>>> +{
>>> +       int i;
>>> +
>>> +       /* store the SPI (in bigendian) and IPidx */
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXSPI, spi);
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPIDX, ip_idx);
>>> +       IXGBE_WRITE_FLUSH(hw);
>>> +
>>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_SPI);
>>> +
>>> +       /* store the key, salt, and mode */
>>> +       for (i = 0; i < 4; i++)
>>> +               IXGBE_WRITE_REG(hw, IXGBE_IPSRXKEY(i),
>>> cpu_to_be32(key[3-i]));
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXSALT, cpu_to_be32(salt));
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXMOD, mode);
>>> +       IXGBE_WRITE_FLUSH(hw);
>>> +
>>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_KEY);
>>> +}
>>
>>
>> Is there any reason why you could write the SPI, key, salt, and mode,
>> then flush, and trigger the writes via the IPSRXIDX? Just wondering
>> since it would likely save you a few cycles avoiding PCIe bus stalls.
>
>
> See note above about religiously flushing everything to make a persnickety
> chip work.

I get the flushing. What I am saying is that as far as I can tell the
SPI, salt, and mode don't overlap so you could update all 3, flush,
and then call set_rx_item twice.

>>
>>
>>> +
>>> +/**
>>> + * ixgbe_ipsec_set_rx_ip - set up the register bits to save SA IP addr
>>> info
>>> + * @hw: hw specific details
>>> + * @idx: register index to write
>>> + * @addr: IP address byte array
>>> + **/
>>> +static void ixgbe_ipsec_set_rx_ip(struct ixgbe_hw *hw, u16 idx, u32
>>> addr[])
>>> +{
>>> +       int i;
>>> +
>>> +       /* store the ip address */
>>> +       for (i = 0; i < 4; i++)
>>> +               IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPADDR(i), addr[i]);
>>> +       IXGBE_WRITE_FLUSH(hw);
>>> +
>>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_IP);
>>> +}
>>> +
>>
>>
>> This piece is kind of confusing. I would suggest storing the address
>> as a __be32 pointer instead of a u32 array. That way you start with
>> either an IPv6 or an IPv4 address at offset 0 instead of the way the
>> hardware is defined which has you writing it at either 0 or 3
>> depending on if the address is IPv6 or IPv4.
>
>
> Using a __be32 rather than u32 is fine here, it doesn't make much
> difference.
>
> If I understand your suggestion correctly, we would also need an additional
> function parameter to tell us if we were pointing to an ipv6 or ipv4
> address.  Since the driver's SW tables are modeling the HW, I think it is
> simpler to leave it in the array.

Actually I am not too concerned about needing a flag, but the __be32
usage addresses another problem. If I am not mistaken in order to
store an IPv6 value you will have to write addr[3] to IPADDR(0) and so
forth since the hardware is storing the IPv6 address as little endian.
So if you store the IPv4 address in addr[0] as a __be32 value and
leave the rest as zero you should get the correct ordering in either
setup when you store either IPv6 or IPv4 values.

>
>>
>>> +/**
>>> + * ixgbe_ipsec_clear_hw_tables - because some tables don't get cleared
>>> on reset
>>> + * @adapter: board private structure
>>> + **/
>>> +void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
>>> +{
>>> +       struct ixgbe_hw *hw = &adapter->hw;
>>> +       u32 buf[4] = {0, 0, 0, 0};
>>> +       u16 idx;
>>> +
>>> +       /* disable Rx and Tx SA lookup */
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, 0);
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, 0);
>>> +
>>> +       /* scrub the tables */
>>> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
>>> +               ixgbe_ipsec_set_tx_sa(hw, idx, buf, 0);
>>> +
>>> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
>>> +               ixgbe_ipsec_set_rx_sa(hw, idx, 0, buf, 0, 0, 0);
>>> +
>>> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_RX_IP_COUNT; idx++)
>>> +               ixgbe_ipsec_set_rx_ip(hw, idx, buf);
>>> +}
>>> +
>>> +/**
>>> + * ixgbe_init_ipsec_offload - initialize security registers for IPSec
>>> operation
>>> + * @adapter: board private structure
>>> + **/
>>> +void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
>>> +{
>>> +       ixgbe_ipsec_clear_hw_tables(adapter);
>>> +}
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>>> new file mode 100644
>>> index 0000000..017b13f
>>> --- /dev/null
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>>> @@ -0,0 +1,50 @@
>>>
>>> +/*******************************************************************************
>>> +
>>> +  Intel 10 Gigabit PCI Express Linux driver
>>> +  Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.
>>> +
>>> +  This program is free software; you can redistribute it and/or modify
>>> it
>>> +  under the terms and conditions of the GNU General Public License,
>>> +  version 2, as published by the Free Software Foundation.
>>> +
>>> +  This program is distributed in the hope it will be useful, but WITHOUT
>>> +  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> +  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>> for
>>> +  more details.
>>> +
>>> +  You should have received a copy of the GNU General Public License
>>> along with
>>> +  this program.  If not, see <http://www.gnu.org/licenses/>.
>>> +
>>> +  The full GNU General Public License is included in this distribution
>>> in
>>> +  the file called "COPYING".
>>> +
>>> +  Contact Information:
>>> +  Linux NICS <linux.nics@intel.com>
>>> +  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
>>> +  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR
>>> 97124-6497
>>> +
>>>
>>> +*******************************************************************************/
>>> +
>>> +#ifndef _IXGBE_IPSEC_H_
>>> +#define _IXGBE_IPSEC_H_
>>> +
>>> +#define IXGBE_IPSEC_MAX_SA_COUNT       1024
>>> +#define IXGBE_IPSEC_MAX_RX_IP_COUNT    128
>>> +#define IXGBE_IPSEC_BASE_RX_INDEX      IXGBE_IPSEC_MAX_SA_COUNT
>>> +#define IXGBE_IPSEC_BASE_TX_INDEX      (IXGBE_IPSEC_MAX_SA_COUNT * 2)
>>> +
>>> +#define IXGBE_RXTXIDX_IPS_EN           0x00000001
>>> +#define IXGBE_RXIDX_TBL_MASK           0x00000006
>>> +#define IXGBE_RXIDX_TBL_IP             0x00000002
>>> +#define IXGBE_RXIDX_TBL_SPI            0x00000004
>>> +#define IXGBE_RXIDX_TBL_KEY            0x00000006
>>
>>
>> You might look at converting these table entries into an enum and add
>> a shift value. It will make things much easier to read.
>>
>>> +#define IXGBE_RXTXIDX_IDX_MASK         0x00001ff8
>>> +#define IXGBE_RXTXIDX_IDX_READ         0x40000000
>>> +#define IXGBE_RXTXIDX_IDX_WRITE                0x80000000
>>> +
>>> +#define IXGBE_RXMOD_VALID              0x00000001
>>> +#define IXGBE_RXMOD_PROTO_ESP          0x00000004
>>> +#define IXGBE_RXMOD_DECRYPT            0x00000008
>>> +#define IXGBE_RXMOD_IPV6               0x00000010
>>> +
>>> +#endif /* _IXGBE_IPSEC_H_ */
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> index 6d5f31e..51fb3cf 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> @@ -10327,6 +10327,7 @@ static int ixgbe_probe(struct pci_dev *pdev,
>>> const struct pci_device_id *ent)
>>>                                           NETIF_F_FCOE_MTU;
>>>          }
>>>   #endif /* IXGBE_FCOE */
>>> +       ixgbe_init_ipsec_offload(adapter);
>>>
>>>          if (adapter->flags2 & IXGBE_FLAG2_RSC_CAPABLE)
>>>                  netdev->hw_features |= NETIF_F_LRO;
>>> --
>>> 2.7.4
>>>
>>> _______________________________________________
>>> Intel-wired-lan mailing list
>>> Intel-wired-lan@osuosl.org
>>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 02/10] ixgbe: add ipsec register access routines
@ 2017-12-07 16:02         ` Alexander Duyck
  0 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-07 16:02 UTC (permalink / raw)
  To: intel-wired-lan

On Wed, Dec 6, 2017 at 9:43 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> Thanks, Alex, for your detailed comments, I do appreciate the time and
> thought you put into them.
>
> Responses below...
>
> sln
>
> On 12/5/2017 8:56 AM, Alexander Duyck wrote:
>>
>> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
>> <shannon.nelson@oracle.com> wrote:
>>>
>>> Add a few routines to make access to the ipsec registers just a little
>>> easier, and throw in the beginnings of an initialization.
>>>
>>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>>> ---
>>>   drivers/net/ethernet/intel/ixgbe/Makefile      |   1 +
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       |   6 +
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 157
>>> +++++++++++++++++++++++++
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h |  50 ++++++++
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |   1 +
>>>   5 files changed, 215 insertions(+)
>>>   create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>>   create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>>>
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/Makefile
>>> b/drivers/net/ethernet/intel/ixgbe/Makefile
>>> index 35e6fa6..8319465 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/Makefile
>>> +++ b/drivers/net/ethernet/intel/ixgbe/Makefile
>>> @@ -42,3 +42,4 @@ ixgbe-$(CONFIG_IXGBE_DCB) +=  ixgbe_dcb.o
>>> ixgbe_dcb_82598.o \
>>>   ixgbe-$(CONFIG_IXGBE_HWMON) += ixgbe_sysfs.o
>>>   ixgbe-$(CONFIG_DEBUG_FS) += ixgbe_debugfs.o
>>>   ixgbe-$(CONFIG_FCOE:m=y) += ixgbe_fcoe.o
>>> +ixgbe-$(CONFIG_XFRM_OFFLOAD) += ixgbe_ipsec.o
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> index dd55787..1e11462 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> @@ -52,6 +52,7 @@
>>>   #ifdef CONFIG_IXGBE_DCA
>>>   #include <linux/dca.h>
>>>   #endif
>>> +#include "ixgbe_ipsec.h"
>>>
>>>   #include <net/busy_poll.h>
>>>
>>> @@ -1001,4 +1002,9 @@ void ixgbe_store_key(struct ixgbe_adapter
>>> *adapter);
>>>   void ixgbe_store_reta(struct ixgbe_adapter *adapter);
>>>   s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
>>>                         u32 adv_sym, u32 adv_asm, u32 lp_sym, u32
>>> lp_asm);
>>> +#ifdef CONFIG_XFRM_OFFLOAD
>>> +void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
>>> +#else
>>> +static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter
>>> *adapter) { };
>>> +#endif /* CONFIG_XFRM_OFFLOAD */
>>>   #endif /* _IXGBE_H_ */
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> new file mode 100644
>>> index 0000000..14dd011
>>> --- /dev/null
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> @@ -0,0 +1,157 @@
>>>
>>> +/*******************************************************************************
>>> + *
>>> + * Intel 10 Gigabit PCI Express Linux driver
>>> + * Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> it
>>> + * under the terms and conditions of the GNU General Public License,
>>> + * version 2, as published by the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope it will be useful, but
>>> WITHOUT
>>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>> for
>>> + * more details.
>>> + *
>>> + * You should have received a copy of the GNU General Public License
>>> along with
>>> + * this program.  If not, see <http://www.gnu.org/licenses/>.
>>> + *
>>> + * The full GNU General Public License is included in this distribution
>>> in
>>> + * the file called "COPYING".
>>> + *
>>> + * Contact Information:
>>> + * Linux NICS <linux.nics@intel.com>
>>> + * e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
>>> + * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR
>>> 97124-6497
>>> + *
>>> +
>>> ******************************************************************************/
>>> +
>>> +#include "ixgbe.h"
>>> +
>>> +/**
>>> + * ixgbe_ipsec_set_tx_sa - set the Tx SA registers
>>> + * @hw: hw specific details
>>> + * @idx: register index to write
>>> + * @key: key byte array
>>> + * @salt: salt bytes
>>> + **/
>>> +static void ixgbe_ipsec_set_tx_sa(struct ixgbe_hw *hw, u16 idx,
>>> +                                 u32 key[], u32 salt)
>>> +{
>>> +       u32 reg;
>>> +       int i;
>>> +
>>> +       for (i = 0; i < 4; i++)
>>> +               IXGBE_WRITE_REG(hw, IXGBE_IPSTXKEY(i),
>>> cpu_to_be32(key[3-i]));
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXSALT, cpu_to_be32(salt));
>>> +       IXGBE_WRITE_FLUSH(hw);
>>> +
>>> +       reg = IXGBE_READ_REG(hw, IXGBE_IPSTXIDX);
>>> +       reg &= IXGBE_RXTXIDX_IPS_EN;
>>> +       reg |= idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, reg);
>>> +       IXGBE_WRITE_FLUSH(hw);
>>> +}
>>> +
>>
>>
>> So there are a few things here to unpack.
>>
>> The first is the carry-forward of the IPS bit. I'm not sure that is
>> the best way to go. Do we really expect to be updating SA values if
>> IPsec offload is not enabled?
>
>
> In order to save on energy, we don't enable the engine until we have the
> first SA successfully stored in the tables, so the enable bit will be off
> for that one.
>
> Also, the datasheet specifically says for the Rx table "Software should not
> make changes in the Rx SA tables while changing the IPSEC_EN bit." I figured
> I'd use the same method on both tables for consistency.
>
>> If so we may just want to carry a bit
>> flag somewhere in the ixgbe_hw struct indicating if Tx IPsec offload
>> is enabled and use that to determine the value for this bit.
>>
>> Also we should probably replace "3" with a value indicating that it is
>> the SA index shift.
>
>
> Sure, that would be good.
>
>>
>> Also technically the WRITE_FLUSH isn't needed if you are doing a PCIe
>> read anyway to get IPSTXIDX.
>
>
> That's from having to be very fastidious about these reads/writes/flushes
> before the engine actually worked for me.  I could spend time taking them
> out and testing each change again, but they aren't in a fast path, so I'm
> really not worried about it.
>
>
>>
>>> +/**
>>> + * ixgbe_ipsec_set_rx_item - set an Rx table item
>>> + * @hw: hw specific details
>>> + * @idx: register index to write
>>> + * @tbl: table selector
>>> + *
>>> + * Trigger the device to store into a particular Rx table the
>>> + * data that has already been loaded into the input register
>>> + **/
>>> +static void ixgbe_ipsec_set_rx_item(struct ixgbe_hw *hw, u16 idx, u32
>>> tbl)
>>> +{
>>> +       u32 reg;
>>> +
>>> +       reg = IXGBE_READ_REG(hw, IXGBE_IPSRXIDX);
>>> +       reg &= IXGBE_RXTXIDX_IPS_EN;
>>> +       reg |= tbl | idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, reg);
>>> +       IXGBE_WRITE_FLUSH(hw);
>>> +}
>>> +
>>
>>
>> The Rx version of this gets a bit trickier since the datasheet
>> actually indicates there are a few different types of tables that can
>> be indexed via this. Also why is the tbl value not being shifted? It
>> seems like it should be shifted by 1 to avoid overwriting the IPS_EN
>> bit. Really I would like to see the tbl value converted to an enum and
>> shifted by 1 in order to generate the table reference.
>
>
> I would have done this, but we can't use an enum shifted bit because the
> field values are 01, 10, and 11.  I used the direct 2, 4, and 6 values
> rather than shifting by one, but I can reset them and shift by 1.

I didn't mean 1 << enum I was referring to enum << 1. Right now you
can be given a table value of 3 if somebody incorrectly used the
function and the side effect is that it overwrites the enable bit.

>>
>> Here the "3" is a table index. It might be nice to call that out with
>> a name instead of using the magic number.
>
>
> Yep
>
>
>>
>>> +/**
>>> + * ixgbe_ipsec_set_rx_sa - set up the register bits to save SA info
>>> + * @hw: hw specific details
>>> + * @idx: register index to write
>>> + * @spi: security parameter index
>>> + * @key: key byte array
>>> + * @salt: salt bytes
>>> + * @mode: rx decrypt control bits
>>> + * @ip_idx: index into IP table for related IP address
>>> + **/
>>> +static void ixgbe_ipsec_set_rx_sa(struct ixgbe_hw *hw, u16 idx, __be32
>>> spi,
>>> +                                 u32 key[], u32 salt, u32 mode, u32
>>> ip_idx)
>>> +{
>>> +       int i;
>>> +
>>> +       /* store the SPI (in bigendian) and IPidx */
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXSPI, spi);
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPIDX, ip_idx);
>>> +       IXGBE_WRITE_FLUSH(hw);
>>> +
>>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_SPI);
>>> +
>>> +       /* store the key, salt, and mode */
>>> +       for (i = 0; i < 4; i++)
>>> +               IXGBE_WRITE_REG(hw, IXGBE_IPSRXKEY(i),
>>> cpu_to_be32(key[3-i]));
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXSALT, cpu_to_be32(salt));
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXMOD, mode);
>>> +       IXGBE_WRITE_FLUSH(hw);
>>> +
>>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_KEY);
>>> +}
>>
>>
>> Is there any reason why you could write the SPI, key, salt, and mode,
>> then flush, and trigger the writes via the IPSRXIDX? Just wondering
>> since it would likely save you a few cycles avoiding PCIe bus stalls.
>
>
> See note above about religiously flushing everything to make a persnickety
> chip work.

I get the flushing. What I am saying is that as far as I can tell the
SPI, salt, and mode don't overlap so you could update all 3, flush,
and then call set_rx_item twice.

>>
>>
>>> +
>>> +/**
>>> + * ixgbe_ipsec_set_rx_ip - set up the register bits to save SA IP addr
>>> info
>>> + * @hw: hw specific details
>>> + * @idx: register index to write
>>> + * @addr: IP address byte array
>>> + **/
>>> +static void ixgbe_ipsec_set_rx_ip(struct ixgbe_hw *hw, u16 idx, u32
>>> addr[])
>>> +{
>>> +       int i;
>>> +
>>> +       /* store the ip address */
>>> +       for (i = 0; i < 4; i++)
>>> +               IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPADDR(i), addr[i]);
>>> +       IXGBE_WRITE_FLUSH(hw);
>>> +
>>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_IP);
>>> +}
>>> +
>>
>>
>> This piece is kind of confusing. I would suggest storing the address
>> as a __be32 pointer instead of a u32 array. That way you start with
>> either an IPv6 or an IPv4 address at offset 0 instead of the way the
>> hardware is defined which has you writing it at either 0 or 3
>> depending on if the address is IPv6 or IPv4.
>
>
> Using a __be32 rather than u32 is fine here, it doesn't make much
> difference.
>
> If I understand your suggestion correctly, we would also need an additional
> function parameter to tell us if we were pointing to an ipv6 or ipv4
> address.  Since the driver's SW tables are modeling the HW, I think it is
> simpler to leave it in the array.

Actually I am not too concerned about needing a flag, but the __be32
usage addresses another problem. If I am not mistaken in order to
store an IPv6 value you will have to write addr[3] to IPADDR(0) and so
forth since the hardware is storing the IPv6 address as little endian.
So if you store the IPv4 address in addr[0] as a __be32 value and
leave the rest as zero you should get the correct ordering in either
setup when you store either IPv6 or IPv4 values.

>
>>
>>> +/**
>>> + * ixgbe_ipsec_clear_hw_tables - because some tables don't get cleared
>>> on reset
>>> + * @adapter: board private structure
>>> + **/
>>> +void ixgbe_ipsec_clear_hw_tables(struct ixgbe_adapter *adapter)
>>> +{
>>> +       struct ixgbe_hw *hw = &adapter->hw;
>>> +       u32 buf[4] = {0, 0, 0, 0};
>>> +       u16 idx;
>>> +
>>> +       /* disable Rx and Tx SA lookup */
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, 0);
>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSTXIDX, 0);
>>> +
>>> +       /* scrub the tables */
>>> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
>>> +               ixgbe_ipsec_set_tx_sa(hw, idx, buf, 0);
>>> +
>>> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_SA_COUNT; idx++)
>>> +               ixgbe_ipsec_set_rx_sa(hw, idx, 0, buf, 0, 0, 0);
>>> +
>>> +       for (idx = 0; idx < IXGBE_IPSEC_MAX_RX_IP_COUNT; idx++)
>>> +               ixgbe_ipsec_set_rx_ip(hw, idx, buf);
>>> +}
>>> +
>>> +/**
>>> + * ixgbe_init_ipsec_offload - initialize security registers for IPSec
>>> operation
>>> + * @adapter: board private structure
>>> + **/
>>> +void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
>>> +{
>>> +       ixgbe_ipsec_clear_hw_tables(adapter);
>>> +}
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>>> new file mode 100644
>>> index 0000000..017b13f
>>> --- /dev/null
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.h
>>> @@ -0,0 +1,50 @@
>>>
>>> +/*******************************************************************************
>>> +
>>> +  Intel 10 Gigabit PCI Express Linux driver
>>> +  Copyright(c) 2017 Oracle and/or its affiliates. All rights reserved.
>>> +
>>> +  This program is free software; you can redistribute it and/or modify
>>> it
>>> +  under the terms and conditions of the GNU General Public License,
>>> +  version 2, as published by the Free Software Foundation.
>>> +
>>> +  This program is distributed in the hope it will be useful, but WITHOUT
>>> +  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> +  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>> for
>>> +  more details.
>>> +
>>> +  You should have received a copy of the GNU General Public License
>>> along with
>>> +  this program.  If not, see <http://www.gnu.org/licenses/>.
>>> +
>>> +  The full GNU General Public License is included in this distribution
>>> in
>>> +  the file called "COPYING".
>>> +
>>> +  Contact Information:
>>> +  Linux NICS <linux.nics@intel.com>
>>> +  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
>>> +  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR
>>> 97124-6497
>>> +
>>>
>>> +*******************************************************************************/
>>> +
>>> +#ifndef _IXGBE_IPSEC_H_
>>> +#define _IXGBE_IPSEC_H_
>>> +
>>> +#define IXGBE_IPSEC_MAX_SA_COUNT       1024
>>> +#define IXGBE_IPSEC_MAX_RX_IP_COUNT    128
>>> +#define IXGBE_IPSEC_BASE_RX_INDEX      IXGBE_IPSEC_MAX_SA_COUNT
>>> +#define IXGBE_IPSEC_BASE_TX_INDEX      (IXGBE_IPSEC_MAX_SA_COUNT * 2)
>>> +
>>> +#define IXGBE_RXTXIDX_IPS_EN           0x00000001
>>> +#define IXGBE_RXIDX_TBL_MASK           0x00000006
>>> +#define IXGBE_RXIDX_TBL_IP             0x00000002
>>> +#define IXGBE_RXIDX_TBL_SPI            0x00000004
>>> +#define IXGBE_RXIDX_TBL_KEY            0x00000006
>>
>>
>> You might look at converting these table entries into an enum and add
>> a shift value. It will make things much easier to read.
>>
>>> +#define IXGBE_RXTXIDX_IDX_MASK         0x00001ff8
>>> +#define IXGBE_RXTXIDX_IDX_READ         0x40000000
>>> +#define IXGBE_RXTXIDX_IDX_WRITE                0x80000000
>>> +
>>> +#define IXGBE_RXMOD_VALID              0x00000001
>>> +#define IXGBE_RXMOD_PROTO_ESP          0x00000004
>>> +#define IXGBE_RXMOD_DECRYPT            0x00000008
>>> +#define IXGBE_RXMOD_IPV6               0x00000010
>>> +
>>> +#endif /* _IXGBE_IPSEC_H_ */
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> index 6d5f31e..51fb3cf 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> @@ -10327,6 +10327,7 @@ static int ixgbe_probe(struct pci_dev *pdev,
>>> const struct pci_device_id *ent)
>>>                                           NETIF_F_FCOE_MTU;
>>>          }
>>>   #endif /* IXGBE_FCOE */
>>> +       ixgbe_init_ipsec_offload(adapter);
>>>
>>>          if (adapter->flags2 & IXGBE_FLAG2_RSC_CAPABLE)
>>>                  netdev->hw_features |= NETIF_F_LRO;
>>> --
>>> 2.7.4
>>>
>>> _______________________________________________
>>> Intel-wired-lan mailing list
>>> Intel-wired-lan at osuosl.org
>>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 02/10] ixgbe: add ipsec register access routines
  2017-12-07 16:02         ` Alexander Duyck
@ 2017-12-07 17:03           ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07 17:03 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On 12/7/2017 8:02 AM, Alexander Duyck wrote:
> On Wed, Dec 6, 2017 at 9:43 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> Thanks, Alex, for your detailed comments, I do appreciate the time and
>> thought you put into them.
>>
>> Responses below...
>>
>> sln
>>
>> On 12/5/2017 8:56 AM, Alexander Duyck wrote:

<snip>

>>>> +       reg = IXGBE_READ_REG(hw, IXGBE_IPSRXIDX);
>>>> +       reg &= IXGBE_RXTXIDX_IPS_EN;
>>>> +       reg |= tbl | idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
>>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, reg);
>>>> +       IXGBE_WRITE_FLUSH(hw);
>>>> +}
>>>> +
>>>
>>>
>>> The Rx version of this gets a bit trickier since the datasheet
>>> actually indicates there are a few different types of tables that can
>>> be indexed via this. Also why is the tbl value not being shifted? It
>>> seems like it should be shifted by 1 to avoid overwriting the IPS_EN
>>> bit. Really I would like to see the tbl value converted to an enum and
>>> shifted by 1 in order to generate the table reference.
>>
>>
>> I would have done this, but we can't use an enum shifted bit because the
>> field values are 01, 10, and 11.  I used the direct 2, 4, and 6 values
>> rather than shifting by one, but I can reset them and shift by 1.
> 
> I didn't mean 1 << enum I was referring to enum << 1. Right now you
> can be given a table value of 3 if somebody incorrectly used the
> function and the side effect is that it overwrites the enable bit.

Okay, sure, that makes sense.

> 
>>>

<snip>

>>>> +       /* store the SPI (in bigendian) and IPidx */
>>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXSPI, spi);
>>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPIDX, ip_idx);
>>>> +       IXGBE_WRITE_FLUSH(hw);
>>>> +
>>>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_SPI);
>>>> +
>>>> +       /* store the key, salt, and mode */
>>>> +       for (i = 0; i < 4; i++)
>>>> +               IXGBE_WRITE_REG(hw, IXGBE_IPSRXKEY(i),
>>>> cpu_to_be32(key[3-i]));
>>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXSALT, cpu_to_be32(salt));
>>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXMOD, mode);
>>>> +       IXGBE_WRITE_FLUSH(hw);
>>>> +
>>>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_KEY);
>>>> +}
>>>
>>>
>>> Is there any reason why you could write the SPI, key, salt, and mode,
>>> then flush, and trigger the writes via the IPSRXIDX? Just wondering
>>> since it would likely save you a few cycles avoiding PCIe bus stalls.
>>
>>
>> See note above about religiously flushing everything to make a persnickety
>> chip work.
> 
> I get the flushing. What I am saying is that as far as I can tell the
> SPI, salt, and mode don't overlap so you could update all 3, flush,
> and then call set_rx_item twice.

I'll check that for here and a possibly a couple other places.

> 
>>>
>>>
>>>> +
>>>> +/**
>>>> + * ixgbe_ipsec_set_rx_ip - set up the register bits to save SA IP addr
>>>> info
>>>> + * @hw: hw specific details
>>>> + * @idx: register index to write
>>>> + * @addr: IP address byte array
>>>> + **/
>>>> +static void ixgbe_ipsec_set_rx_ip(struct ixgbe_hw *hw, u16 idx, u32
>>>> addr[])
>>>> +{
>>>> +       int i;
>>>> +
>>>> +       /* store the ip address */
>>>> +       for (i = 0; i < 4; i++)
>>>> +               IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPADDR(i), addr[i]);
>>>> +       IXGBE_WRITE_FLUSH(hw);
>>>> +
>>>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_IP);
>>>> +}
>>>> +
>>>
>>>
>>> This piece is kind of confusing. I would suggest storing the address
>>> as a __be32 pointer instead of a u32 array. That way you start with
>>> either an IPv6 or an IPv4 address at offset 0 instead of the way the
>>> hardware is defined which has you writing it at either 0 or 3
>>> depending on if the address is IPv6 or IPv4.
>>
>>
>> Using a __be32 rather than u32 is fine here, it doesn't make much
>> difference.
>>
>> If I understand your suggestion correctly, we would also need an additional
>> function parameter to tell us if we were pointing to an ipv6 or ipv4
>> address.  Since the driver's SW tables are modeling the HW, I think it is
>> simpler to leave it in the array.
> 
> Actually I am not too concerned about needing a flag, but the __be32
> usage addresses another problem. If I am not mistaken in order to
> store an IPv6 value you will have to write addr[3] to IPADDR(0) and so
> forth since the hardware is storing the IPv6 address as little endian.
> So if you store the IPv4 address in addr[0] as a __be32 value and
> leave the rest as zero you should get the correct ordering in either
> setup when you store either IPv6 or IPv4 values.

The datasheet says
   n=0 contains the MSB for an IPv6 IP Address.
   n=3 contains an IPv4 IP Address or the LSB for an IPv6 IP Address.
If the ipv6 address is handed to us in bigendian, then I think we're okay.

Obviously this is something that will get tested when I get around to 
fixing up support for ipv6 in the future, and perhaps I'll be surprised.

sln

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 02/10] ixgbe: add ipsec register access routines
@ 2017-12-07 17:03           ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07 17:03 UTC (permalink / raw)
  To: intel-wired-lan

On 12/7/2017 8:02 AM, Alexander Duyck wrote:
> On Wed, Dec 6, 2017 at 9:43 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> Thanks, Alex, for your detailed comments, I do appreciate the time and
>> thought you put into them.
>>
>> Responses below...
>>
>> sln
>>
>> On 12/5/2017 8:56 AM, Alexander Duyck wrote:

<snip>

>>>> +       reg = IXGBE_READ_REG(hw, IXGBE_IPSRXIDX);
>>>> +       reg &= IXGBE_RXTXIDX_IPS_EN;
>>>> +       reg |= tbl | idx << 3 | IXGBE_RXTXIDX_IDX_WRITE;
>>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIDX, reg);
>>>> +       IXGBE_WRITE_FLUSH(hw);
>>>> +}
>>>> +
>>>
>>>
>>> The Rx version of this gets a bit trickier since the datasheet
>>> actually indicates there are a few different types of tables that can
>>> be indexed via this. Also why is the tbl value not being shifted? It
>>> seems like it should be shifted by 1 to avoid overwriting the IPS_EN
>>> bit. Really I would like to see the tbl value converted to an enum and
>>> shifted by 1 in order to generate the table reference.
>>
>>
>> I would have done this, but we can't use an enum shifted bit because the
>> field values are 01, 10, and 11.  I used the direct 2, 4, and 6 values
>> rather than shifting by one, but I can reset them and shift by 1.
> 
> I didn't mean 1 << enum I was referring to enum << 1. Right now you
> can be given a table value of 3 if somebody incorrectly used the
> function and the side effect is that it overwrites the enable bit.

Okay, sure, that makes sense.

> 
>>>

<snip>

>>>> +       /* store the SPI (in bigendian) and IPidx */
>>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXSPI, spi);
>>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPIDX, ip_idx);
>>>> +       IXGBE_WRITE_FLUSH(hw);
>>>> +
>>>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_SPI);
>>>> +
>>>> +       /* store the key, salt, and mode */
>>>> +       for (i = 0; i < 4; i++)
>>>> +               IXGBE_WRITE_REG(hw, IXGBE_IPSRXKEY(i),
>>>> cpu_to_be32(key[3-i]));
>>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXSALT, cpu_to_be32(salt));
>>>> +       IXGBE_WRITE_REG(hw, IXGBE_IPSRXMOD, mode);
>>>> +       IXGBE_WRITE_FLUSH(hw);
>>>> +
>>>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_KEY);
>>>> +}
>>>
>>>
>>> Is there any reason why you could write the SPI, key, salt, and mode,
>>> then flush, and trigger the writes via the IPSRXIDX? Just wondering
>>> since it would likely save you a few cycles avoiding PCIe bus stalls.
>>
>>
>> See note above about religiously flushing everything to make a persnickety
>> chip work.
> 
> I get the flushing. What I am saying is that as far as I can tell the
> SPI, salt, and mode don't overlap so you could update all 3, flush,
> and then call set_rx_item twice.

I'll check that for here and a possibly a couple other places.

> 
>>>
>>>
>>>> +
>>>> +/**
>>>> + * ixgbe_ipsec_set_rx_ip - set up the register bits to save SA IP addr
>>>> info
>>>> + * @hw: hw specific details
>>>> + * @idx: register index to write
>>>> + * @addr: IP address byte array
>>>> + **/
>>>> +static void ixgbe_ipsec_set_rx_ip(struct ixgbe_hw *hw, u16 idx, u32
>>>> addr[])
>>>> +{
>>>> +       int i;
>>>> +
>>>> +       /* store the ip address */
>>>> +       for (i = 0; i < 4; i++)
>>>> +               IXGBE_WRITE_REG(hw, IXGBE_IPSRXIPADDR(i), addr[i]);
>>>> +       IXGBE_WRITE_FLUSH(hw);
>>>> +
>>>> +       ixgbe_ipsec_set_rx_item(hw, idx, IXGBE_RXIDX_TBL_IP);
>>>> +}
>>>> +
>>>
>>>
>>> This piece is kind of confusing. I would suggest storing the address
>>> as a __be32 pointer instead of a u32 array. That way you start with
>>> either an IPv6 or an IPv4 address at offset 0 instead of the way the
>>> hardware is defined which has you writing it at either 0 or 3
>>> depending on if the address is IPv6 or IPv4.
>>
>>
>> Using a __be32 rather than u32 is fine here, it doesn't make much
>> difference.
>>
>> If I understand your suggestion correctly, we would also need an additional
>> function parameter to tell us if we were pointing to an ipv6 or ipv4
>> address.  Since the driver's SW tables are modeling the HW, I think it is
>> simpler to leave it in the array.
> 
> Actually I am not too concerned about needing a flag, but the __be32
> usage addresses another problem. If I am not mistaken in order to
> store an IPv6 value you will have to write addr[3] to IPADDR(0) and so
> forth since the hardware is storing the IPv6 address as little endian.
> So if you store the IPv4 address in addr[0] as a __be32 value and
> leave the rest as zero you should get the correct ordering in either
> setup when you store either IPv6 or IPv4 values.

The datasheet says
   n=0 contains the MSB for an IPv6 IP Address.
   n=3 contains an IPv4 IP Address or the LSB for an IPv6 IP Address.
If the ipv6 address is handed to us in bigendian, then I think we're okay.

Obviously this is something that will get tested when I get around to 
fixing up support for ipv6 in the future, and perhaps I'll be surprised.

sln

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 06/10] ixgbe: restore offloaded SAs after a reset
  2017-12-07  5:43       ` Shannon Nelson
@ 2017-12-07 17:16         ` Alexander Duyck
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-07 17:16 UTC (permalink / raw)
  To: Shannon Nelson
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On Wed, Dec 6, 2017 at 9:43 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> On 12/5/2017 9:30 AM, Alexander Duyck wrote:
>>
>> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
>> <shannon.nelson@oracle.com> wrote:
>>>
>>> On a chip reset most of the table contents are lost, so must be
>>> restored.  This scans the driver's ipsec tables and restores both
>>> the filled and empty table slots to their pre-reset values.
>>>
>>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>>> ---
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  2 +
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 53
>>> ++++++++++++++++++++++++++
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  1 +
>>>   3 files changed, 56 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> index 9487750..7e8bca7 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> @@ -1009,7 +1009,9 @@ s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32
>>> adv_reg, u32 lp_reg,
>>>                         u32 adv_sym, u32 adv_asm, u32 lp_sym, u32
>>> lp_asm);
>>>   #ifdef CONFIG_XFRM_OFFLOAD
>>>   void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
>>> +void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>>>   #else
>>>   static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter
>>> *adapter) { };
>>> +static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) {
>>> };
>>>   #endif /* CONFIG_XFRM_OFFLOAD */
>>>   #endif /* _IXGBE_H_ */
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> index 7b01d92..b93ee7f 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> @@ -292,6 +292,59 @@ static void ixgbe_ipsec_start_engine(struct
>>> ixgbe_adapter *adapter)
>>>   }
>>>
>>>   /**
>>> + * ixgbe_ipsec_restore - restore the ipsec HW settings after a reset
>>> + * @adapter: board private structure
>>> + **/
>>> +void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter)
>>> +{
>>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>>> +       struct ixgbe_hw *hw = &adapter->hw;
>>> +       u32 zbuf[4] = {0, 0, 0, 0};
>>
>>
>> zbuf should be a static const.
>
>
> Yep
>
>>
>>> +       int i;
>>> +
>>> +       if (!(adapter->flags2 & IXGBE_FLAG2_IPSEC_ENABLED))
>>> +               return;
>>> +
>>> +       /* clean up the engine settings */
>>> +       ixgbe_ipsec_stop_engine(adapter);
>>> +
>>> +       /* start the engine */
>>> +       ixgbe_ipsec_start_engine(adapter);
>>> +
>>> +       /* reload the IP addrs */
>>> +       for (i = 0; i < IXGBE_IPSEC_MAX_RX_IP_COUNT; i++) {
>>> +               struct rx_ip_sa *ipsa = &ipsec->ip_tbl[i];
>>> +
>>> +               if (ipsa->used)
>>> +                       ixgbe_ipsec_set_rx_ip(hw, i, ipsa->ipaddr);
>>> +               else
>>> +                       ixgbe_ipsec_set_rx_ip(hw, i, zbuf);
>>
>>
>> If we are doing a restore do we actually need to write the zero
>> values? If we did a reset I thought you had a function that was going
>> through and zeroing everything out. If so this now becomes redundant.
>
>
> Currently ixgbe_ipsec_clear_hw_tables() only gets run at at probe.  It
> should probably get run at remove as well.  Doing this is a bit of safety
> paranoia, and making sure the CAM memory structures that don't get cleared
> on reset have exactly what I expect in them.

You might just move ixgbe_ipsec_clear_hw_tables into the rest logic
itself. Then it covers all cases where you would be resetting the
hardware and expecting a consistent state. It will mean writing some
registers twice during the reset but it is probably better just to
make certain everything stays in a known good state after a reset.

>>
>>> +       }
>>> +
>>> +       /* reload the Rx keys */
>>> +       for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
>>> +               struct rx_sa *rsa = &ipsec->rx_tbl[i];
>>> +
>>> +               if (rsa->used)
>>> +                       ixgbe_ipsec_set_rx_sa(hw, i, rsa->xs->id.spi,
>>> +                                             rsa->key, rsa->salt,
>>> +                                             rsa->mode, rsa->iptbl_ind);
>>> +               else
>>> +                       ixgbe_ipsec_set_rx_sa(hw, i, 0, zbuf, 0, 0, 0);
>>
>>
>> same here
>>
>>> +       }
>>> +
>>> +       /* reload the Tx keys */
>>> +       for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
>>> +               struct tx_sa *tsa = &ipsec->tx_tbl[i];
>>> +
>>> +               if (tsa->used)
>>> +                       ixgbe_ipsec_set_tx_sa(hw, i, tsa->key,
>>> tsa->salt);
>>> +               else
>>> +                       ixgbe_ipsec_set_tx_sa(hw, i, zbuf, 0);
>>
>>
>> and here
>>
>>> +       }
>>> +}
>>> +
>>> +/**
>>>    * ixgbe_ipsec_find_empty_idx - find the first unused security
>>> parameter index
>>>    * @ipsec: pointer to ipsec struct
>>>    * @rxtable: true if we need to look in the Rx table
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> index 01fd89b..6eabf92 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> @@ -5347,6 +5347,7 @@ static void ixgbe_configure(struct ixgbe_adapter
>>> *adapter)
>>>
>>>          ixgbe_set_rx_mode(adapter->netdev);
>>>          ixgbe_restore_vlan(adapter);
>>> +       ixgbe_ipsec_restore(adapter);
>>>
>>>          switch (hw->mac.type) {
>>>          case ixgbe_mac_82599EB:
>>> --
>>> 2.7.4
>>>
>>> _______________________________________________
>>> Intel-wired-lan mailing list
>>> Intel-wired-lan@osuosl.org
>>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 06/10] ixgbe: restore offloaded SAs after a reset
@ 2017-12-07 17:16         ` Alexander Duyck
  0 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-07 17:16 UTC (permalink / raw)
  To: intel-wired-lan

On Wed, Dec 6, 2017 at 9:43 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> On 12/5/2017 9:30 AM, Alexander Duyck wrote:
>>
>> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
>> <shannon.nelson@oracle.com> wrote:
>>>
>>> On a chip reset most of the table contents are lost, so must be
>>> restored.  This scans the driver's ipsec tables and restores both
>>> the filled and empty table slots to their pre-reset values.
>>>
>>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>>> ---
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  2 +
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 53
>>> ++++++++++++++++++++++++++
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  1 +
>>>   3 files changed, 56 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> index 9487750..7e8bca7 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> @@ -1009,7 +1009,9 @@ s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32
>>> adv_reg, u32 lp_reg,
>>>                         u32 adv_sym, u32 adv_asm, u32 lp_sym, u32
>>> lp_asm);
>>>   #ifdef CONFIG_XFRM_OFFLOAD
>>>   void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
>>> +void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>>>   #else
>>>   static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter
>>> *adapter) { };
>>> +static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) {
>>> };
>>>   #endif /* CONFIG_XFRM_OFFLOAD */
>>>   #endif /* _IXGBE_H_ */
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> index 7b01d92..b93ee7f 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> @@ -292,6 +292,59 @@ static void ixgbe_ipsec_start_engine(struct
>>> ixgbe_adapter *adapter)
>>>   }
>>>
>>>   /**
>>> + * ixgbe_ipsec_restore - restore the ipsec HW settings after a reset
>>> + * @adapter: board private structure
>>> + **/
>>> +void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter)
>>> +{
>>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>>> +       struct ixgbe_hw *hw = &adapter->hw;
>>> +       u32 zbuf[4] = {0, 0, 0, 0};
>>
>>
>> zbuf should be a static const.
>
>
> Yep
>
>>
>>> +       int i;
>>> +
>>> +       if (!(adapter->flags2 & IXGBE_FLAG2_IPSEC_ENABLED))
>>> +               return;
>>> +
>>> +       /* clean up the engine settings */
>>> +       ixgbe_ipsec_stop_engine(adapter);
>>> +
>>> +       /* start the engine */
>>> +       ixgbe_ipsec_start_engine(adapter);
>>> +
>>> +       /* reload the IP addrs */
>>> +       for (i = 0; i < IXGBE_IPSEC_MAX_RX_IP_COUNT; i++) {
>>> +               struct rx_ip_sa *ipsa = &ipsec->ip_tbl[i];
>>> +
>>> +               if (ipsa->used)
>>> +                       ixgbe_ipsec_set_rx_ip(hw, i, ipsa->ipaddr);
>>> +               else
>>> +                       ixgbe_ipsec_set_rx_ip(hw, i, zbuf);
>>
>>
>> If we are doing a restore do we actually need to write the zero
>> values? If we did a reset I thought you had a function that was going
>> through and zeroing everything out. If so this now becomes redundant.
>
>
> Currently ixgbe_ipsec_clear_hw_tables() only gets run at at probe.  It
> should probably get run at remove as well.  Doing this is a bit of safety
> paranoia, and making sure the CAM memory structures that don't get cleared
> on reset have exactly what I expect in them.

You might just move ixgbe_ipsec_clear_hw_tables into the rest logic
itself. Then it covers all cases where you would be resetting the
hardware and expecting a consistent state. It will mean writing some
registers twice during the reset but it is probably better just to
make certain everything stays in a known good state after a reset.

>>
>>> +       }
>>> +
>>> +       /* reload the Rx keys */
>>> +       for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
>>> +               struct rx_sa *rsa = &ipsec->rx_tbl[i];
>>> +
>>> +               if (rsa->used)
>>> +                       ixgbe_ipsec_set_rx_sa(hw, i, rsa->xs->id.spi,
>>> +                                             rsa->key, rsa->salt,
>>> +                                             rsa->mode, rsa->iptbl_ind);
>>> +               else
>>> +                       ixgbe_ipsec_set_rx_sa(hw, i, 0, zbuf, 0, 0, 0);
>>
>>
>> same here
>>
>>> +       }
>>> +
>>> +       /* reload the Tx keys */
>>> +       for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT; i++) {
>>> +               struct tx_sa *tsa = &ipsec->tx_tbl[i];
>>> +
>>> +               if (tsa->used)
>>> +                       ixgbe_ipsec_set_tx_sa(hw, i, tsa->key,
>>> tsa->salt);
>>> +               else
>>> +                       ixgbe_ipsec_set_tx_sa(hw, i, zbuf, 0);
>>
>>
>> and here
>>
>>> +       }
>>> +}
>>> +
>>> +/**
>>>    * ixgbe_ipsec_find_empty_idx - find the first unused security
>>> parameter index
>>>    * @ipsec: pointer to ipsec struct
>>>    * @rxtable: true if we need to look in the Rx table
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> index 01fd89b..6eabf92 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> @@ -5347,6 +5347,7 @@ static void ixgbe_configure(struct ixgbe_adapter
>>> *adapter)
>>>
>>>          ixgbe_set_rx_mode(adapter->netdev);
>>>          ixgbe_restore_vlan(adapter);
>>> +       ixgbe_ipsec_restore(adapter);
>>>
>>>          switch (hw->mac.type) {
>>>          case ixgbe_mac_82599EB:
>>> --
>>> 2.7.4
>>>
>>> _______________________________________________
>>> Intel-wired-lan mailing list
>>> Intel-wired-lan at osuosl.org
>>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 07/10] ixgbe: process the Rx ipsec offload
  2017-12-07  5:43       ` Shannon Nelson
@ 2017-12-07 17:20         ` Alexander Duyck
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-07 17:20 UTC (permalink / raw)
  To: Shannon Nelson
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On Wed, Dec 6, 2017 at 9:43 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> On 12/5/2017 9:40 AM, Alexander Duyck wrote:
>>
>> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
>> <shannon.nelson@oracle.com> wrote:
>>>
>>> If the chip sees and decrypts an ipsec offload, set up the skb
>>> sp pointer with the ralated SA info.  Since the chip is rude
>>> enough to keep to itself the table index it used for the
>>> decryption, we have to do our own table lookup, using the
>>> hash for speed.
>>>
>>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>>> ---
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  6 ++
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 89
>>> ++++++++++++++++++++++++++
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  3 +
>>>   3 files changed, 98 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> index 7e8bca7..77f07dc 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> @@ -1009,9 +1009,15 @@ s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32
>>> adv_reg, u32 lp_reg,
>>>                         u32 adv_sym, u32 adv_asm, u32 lp_sym, u32
>>> lp_asm);
>>>   #ifdef CONFIG_XFRM_OFFLOAD
>>>   void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
>>> +void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>> +                   union ixgbe_adv_rx_desc *rx_desc,
>>> +                   struct sk_buff *skb);
>>>   void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>>>   #else
>>>   static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter
>>> *adapter) { };
>>> +static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>> +                                 union ixgbe_adv_rx_desc *rx_desc,
>>> +                                 struct sk_buff *skb) { };
>>>   static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) {
>>> };
>>>   #endif /* CONFIG_XFRM_OFFLOAD */
>>>   #endif /* _IXGBE_H_ */
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> index b93ee7f..fd06d9b 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> @@ -379,6 +379,35 @@ static int ixgbe_ipsec_find_empty_idx(struct
>>> ixgbe_ipsec *ipsec, bool rxtable)
>>>   }
>>>
>>>   /**
>>> + * ixgbe_ipsec_find_rx_state - find the state that matches
>>> + * @ipsec: pointer to ipsec struct
>>> + * @daddr: inbound address to match
>>> + * @proto: protocol to match
>>> + * @spi: SPI to match
>>> + *
>>> + * Returns a pointer to the matching SA state information
>>> + **/
>>> +static struct xfrm_state *ixgbe_ipsec_find_rx_state(struct ixgbe_ipsec
>>> *ipsec,
>>> +                                                   __be32 daddr, u8
>>> proto,
>>> +                                                   __be32 spi)
>>> +{
>>> +       struct rx_sa *rsa;
>>> +       struct xfrm_state *ret = NULL;
>>> +
>>> +       rcu_read_lock();
>>> +       hash_for_each_possible_rcu(ipsec->rx_sa_list, rsa, hlist, spi)
>>> +               if (spi == rsa->xs->id.spi &&
>>> +                   daddr == rsa->xs->id.daddr.a4 &&
>>> +                   proto == rsa->xs->id.proto) {
>>> +                       ret = rsa->xs;
>>> +                       xfrm_state_hold(ret);
>>> +                       break;
>>> +               }
>>> +       rcu_read_unlock();
>>> +       return ret;
>>> +}
>>> +
>>
>>
>> You need to choose a bucket, not just walk through all buckets.
>
>
> I may be wrong, but I believe that is what is happening here, where the spi
> is the hash key.  As the function description says "iterate over all
> possible objects hashing to the same bucket".  Besides, I basically cribbed
> this directly from our Mellanox friends (thanks!).

You're right. I misread that.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 07/10] ixgbe: process the Rx ipsec offload
@ 2017-12-07 17:20         ` Alexander Duyck
  0 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-07 17:20 UTC (permalink / raw)
  To: intel-wired-lan

On Wed, Dec 6, 2017 at 9:43 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> On 12/5/2017 9:40 AM, Alexander Duyck wrote:
>>
>> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
>> <shannon.nelson@oracle.com> wrote:
>>>
>>> If the chip sees and decrypts an ipsec offload, set up the skb
>>> sp pointer with the ralated SA info.  Since the chip is rude
>>> enough to keep to itself the table index it used for the
>>> decryption, we have to do our own table lookup, using the
>>> hash for speed.
>>>
>>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>>> ---
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  6 ++
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 89
>>> ++++++++++++++++++++++++++
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  3 +
>>>   3 files changed, 98 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> index 7e8bca7..77f07dc 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> @@ -1009,9 +1009,15 @@ s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32
>>> adv_reg, u32 lp_reg,
>>>                         u32 adv_sym, u32 adv_asm, u32 lp_sym, u32
>>> lp_asm);
>>>   #ifdef CONFIG_XFRM_OFFLOAD
>>>   void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
>>> +void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>> +                   union ixgbe_adv_rx_desc *rx_desc,
>>> +                   struct sk_buff *skb);
>>>   void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>>>   #else
>>>   static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter
>>> *adapter) { };
>>> +static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>> +                                 union ixgbe_adv_rx_desc *rx_desc,
>>> +                                 struct sk_buff *skb) { };
>>>   static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) {
>>> };
>>>   #endif /* CONFIG_XFRM_OFFLOAD */
>>>   #endif /* _IXGBE_H_ */
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> index b93ee7f..fd06d9b 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> @@ -379,6 +379,35 @@ static int ixgbe_ipsec_find_empty_idx(struct
>>> ixgbe_ipsec *ipsec, bool rxtable)
>>>   }
>>>
>>>   /**
>>> + * ixgbe_ipsec_find_rx_state - find the state that matches
>>> + * @ipsec: pointer to ipsec struct
>>> + * @daddr: inbound address to match
>>> + * @proto: protocol to match
>>> + * @spi: SPI to match
>>> + *
>>> + * Returns a pointer to the matching SA state information
>>> + **/
>>> +static struct xfrm_state *ixgbe_ipsec_find_rx_state(struct ixgbe_ipsec
>>> *ipsec,
>>> +                                                   __be32 daddr, u8
>>> proto,
>>> +                                                   __be32 spi)
>>> +{
>>> +       struct rx_sa *rsa;
>>> +       struct xfrm_state *ret = NULL;
>>> +
>>> +       rcu_read_lock();
>>> +       hash_for_each_possible_rcu(ipsec->rx_sa_list, rsa, hlist, spi)
>>> +               if (spi == rsa->xs->id.spi &&
>>> +                   daddr == rsa->xs->id.daddr.a4 &&
>>> +                   proto == rsa->xs->id.proto) {
>>> +                       ret = rsa->xs;
>>> +                       xfrm_state_hold(ret);
>>> +                       break;
>>> +               }
>>> +       rcu_read_unlock();
>>> +       return ret;
>>> +}
>>> +
>>
>>
>> You need to choose a bucket, not just walk through all buckets.
>
>
> I may be wrong, but I believe that is what is happening here, where the spi
> is the hash key.  As the function description says "iterate over all
> possible objects hashing to the same bucket".  Besides, I basically cribbed
> this directly from our Mellanox friends (thanks!).

You're right. I misread that.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 08/10] ixgbe: process the Tx ipsec offload
  2017-12-07  5:43       ` Shannon Nelson
@ 2017-12-07 17:56         ` Alexander Duyck
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-07 17:56 UTC (permalink / raw)
  To: Shannon Nelson
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On Wed, Dec 6, 2017 at 9:43 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> On 12/5/2017 10:13 AM, Alexander Duyck wrote:
>>
>> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
>> <shannon.nelson@oracle.com> wrote:
>>>
>>> If the skb has a security association referenced in the skb, then
>>> set up the Tx descriptor with the ipsec offload bits.  While we're
>>> here, we fix an oddly named field in the context descriptor struct.
>>>
>>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>>> ---
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       | 10 +++-
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 77
>>> ++++++++++++++++++++++++++
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c   |  4 +-
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 38 ++++++++++---
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_type.h  |  2 +-
>>>   5 files changed, 118 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> index 77f07dc..68097fe 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> @@ -171,10 +171,11 @@ enum ixgbe_tx_flags {
>>>          IXGBE_TX_FLAGS_CC       = 0x08,
>>>          IXGBE_TX_FLAGS_IPV4     = 0x10,
>>>          IXGBE_TX_FLAGS_CSUM     = 0x20,
>>> +       IXGBE_TX_FLAGS_IPSEC    = 0x40,
>>>
>>>          /* software defined flags */
>>> -       IXGBE_TX_FLAGS_SW_VLAN  = 0x40,
>>> -       IXGBE_TX_FLAGS_FCOE     = 0x80,
>>> +       IXGBE_TX_FLAGS_SW_VLAN  = 0x80,
>>> +       IXGBE_TX_FLAGS_FCOE     = 0x100,
>>>   };
>>>
>>>   /* VLAN info */
>>> @@ -1012,12 +1013,17 @@ void ixgbe_init_ipsec_offload(struct
>>> ixgbe_adapter *adapter);
>>>   void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>>                      union ixgbe_adv_rx_desc *rx_desc,
>>>                      struct sk_buff *skb);
>>> +int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
>>> +                  __be16 protocol, struct ixgbe_ipsec_tx_data *itd);
>>>   void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>>>   #else
>>>   static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter
>>> *adapter) { };
>>>   static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>>                                    union ixgbe_adv_rx_desc *rx_desc,
>>>                                    struct sk_buff *skb) { };
>>> +static inline int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring,
>>> +                                struct sk_buff *skb, __be16 protocol,
>>> +                                struct ixgbe_ipsec_tx_data *itd) {
>>> return 0; };
>>>   static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) {
>>> };
>>>   #endif /* CONFIG_XFRM_OFFLOAD */
>>>   #endif /* _IXGBE_H_ */
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> index fd06d9b..2a0dd7a 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> @@ -703,12 +703,89 @@ static void ixgbe_ipsec_del_sa(struct xfrm_state
>>> *xs)
>>>          }
>>>   }
>>>
>>> +/**
>>> + * ixgbe_ipsec_offload_ok - can this packet use the xfrm hw offload
>>> + * @skb: current data packet
>>> + * @xs: pointer to transformer state struct
>>> + **/
>>> +static bool ixgbe_ipsec_offload_ok(struct sk_buff *skb, struct
>>> xfrm_state *xs)
>>> +{
>>> +       if (xs->props.family == AF_INET) {
>>> +               /* Offload with IPv4 options is not supported yet */
>>> +               if (ip_hdr(skb)->ihl > 5)
>>
>>
>> I would make this ihl != 5 instead of "> 5" since smaller values would
>> be invalid as well.
>
>
> Sure
>
>
>>
>>> +                       return false;
>>> +       } else {
>>> +               /* Offload with IPv6 extension headers is not support yet
>>> */
>>> +               if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
>>> +                       return false;
>>> +       }
>>> +
>>> +       return true;
>>> +}
>>> +
>>>   static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
>>>          .xdo_dev_state_add = ixgbe_ipsec_add_sa,
>>>          .xdo_dev_state_delete = ixgbe_ipsec_del_sa,
>>> +       .xdo_dev_offload_ok = ixgbe_ipsec_offload_ok,
>>>   };
>>>
>>>   /**
>>> + * ixgbe_ipsec_tx - setup Tx flags for ipsec offload
>>> + * @tx_ring: outgoing context
>>> + * @skb: current data packet
>>> + * @protocol: network protocol
>>> + * @itd: ipsec Tx data for later use in building context descriptor
>>> + **/
>>> +int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
>>> +                  __be16 protocol, struct ixgbe_ipsec_tx_data *itd)
>>> +{
>>> +       struct ixgbe_adapter *adapter = netdev_priv(tx_ring->netdev);
>>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>>> +       struct xfrm_state *xs;
>>> +       struct tx_sa *tsa;
>>> +
>>> +       if (!skb->sp->len) {
>>> +               netdev_err(tx_ring->netdev, "%s: no xfrm state len =
>>> %d\n",
>>> +                          __func__, skb->sp->len);
>>> +               return 0;
>>> +       }
>>> +
>>> +       xs = xfrm_input_state(skb);
>>> +       if (!xs) {
>>> +               netdev_err(tx_ring->netdev, "%s: no xfrm_input_state() xs
>>> = %p\n",
>>> +                          __func__, xs);
>>> +               return 0;
>>> +       }
>>> +
>>> +       itd->sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX;
>>> +       if (itd->sa_idx > IXGBE_IPSEC_MAX_SA_COUNT) {
>>> +               netdev_err(tx_ring->netdev, "%s: bad sa_idx=%d
>>> handle=%lu\n",
>>> +                          __func__, itd->sa_idx,
>>> xs->xso.offload_handle);
>>> +               return 0;
>>> +       }
>>> +
>>> +       tsa = &ipsec->tx_tbl[itd->sa_idx];
>>> +       if (!tsa->used) {
>>> +               netdev_err(tx_ring->netdev, "%s: unused sa_idx=%d\n",
>>> +                          __func__, itd->sa_idx);
>>> +               return 0;
>>> +       }
>>> +
>>> +       itd->flags = 0;
>>> +       if (xs->id.proto == IPPROTO_ESP) {
>>> +               itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_TYPE_ESP |
>>> +                             IXGBE_ADVTXD_TUCMD_L4T_TCP;
>>
>>
>> Why is the TCP value being set here? This doesn't seem correct either.
>> This implies TCP a TCP offload. It seems like this should only be
>> setting ESP.
>
>
> Honestly?  Because when I was testing that, it didn't work without it. This
> was one of the things I was going to come back to when I started working on
> the csum and tso support.

We might want to try testing with that dropped to see if we need it or
not. I would suspect not since I would imagine this would cause bad
things for non-TCP traffic. Also the inner L4 header shouldn't matter
unless you are trying to offload it.

>>
>>> +               if (protocol == htons(ETH_P_IP))
>>> +                       itd->flags |= IXGBE_ADVTXD_TUCMD_IPV4;
>>
>>
>> Does the IPsec offload need to know if the frame is v4 or v6? I'm just
>> wondering if it does or not.
>
>
> Yes, I believe this is how it knows how much header to skip to find the ESP
> header.  However, I'll test that and see if it can come out.

Like I mentioned last time it might be better to have this handled in
ixgbe_tx_csum. If it is harmless we can probably just include it
there. We should be able to do it in the block after the no_csum
label. I'd be curious if not doing this up until now might have other
effects such as impacting RSS since I know the whole reason for us
having to do the CC stuff anyway was to actually get header split to
work correctly with PF/VF loopback packets. It wouldn't surprise me if
setting these fields defines the packet type received on the other
end.

>> If not then this probably isn't needed.
>> One thought on this line is you might look at moving it into
>> ixgbe_tx_csum. If setting the bit is harmless without setting IXSM we
>> might look at moving it into the end of ixgbe_tx_csum and just make it
>> compare against first->protocol there.
>
>
>>
>>> +               itd->trailer_len = xs->props.trailer_len;
>>> +       }
>>> +       if (tsa->encrypt)
>>> +               itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN;
>>> +
>>> +       return 1;
>>> +}
>>> +
>>> +/**
>>>    * ixgbe_ipsec_rx - decode ipsec bits from Rx descriptor
>>>    * @rx_ring: receiving ring
>>>    * @rx_desc: receive data descriptor
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>> index f1bfae0..d7875b3 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>> @@ -1261,7 +1261,7 @@ void ixgbe_clear_interrupt_scheme(struct
>>> ixgbe_adapter *adapter)
>>>   }
>>>
>>>   void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring, u32 vlan_macip_lens,
>>> -                      u32 fcoe_sof_eof, u32 type_tucmd, u32
>>> mss_l4len_idx)
>>> +                      u32 fceof_saidx, u32 type_tucmd, u32
>>> mss_l4len_idx)
>>>   {
>>>          struct ixgbe_adv_tx_context_desc *context_desc;
>>>          u16 i = tx_ring->next_to_use;
>>> @@ -1275,7 +1275,7 @@ void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring,
>>> u32 vlan_macip_lens,
>>>          type_tucmd |= IXGBE_TXD_CMD_DEXT | IXGBE_ADVTXD_DTYP_CTXT;
>>>
>>>          context_desc->vlan_macip_lens   = cpu_to_le32(vlan_macip_lens);
>>> -       context_desc->seqnum_seed       = cpu_to_le32(fcoe_sof_eof);
>>> +       context_desc->fceof_saidx       = cpu_to_le32(fceof_saidx);
>>>          context_desc->type_tucmd_mlhl   = cpu_to_le32(type_tucmd);
>>>          context_desc->mss_l4len_idx     = cpu_to_le32(mss_l4len_idx);
>>>   }
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> index 60f9f2d..c857594 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> @@ -7659,9 +7659,10 @@ static void ixgbe_service_task(struct work_struct
>>> *work)
>>>
>>>   static int ixgbe_tso(struct ixgbe_ring *tx_ring,
>>>                       struct ixgbe_tx_buffer *first,
>>> -                    u8 *hdr_len)
>>> +                    u8 *hdr_len,
>>> +                    struct ixgbe_ipsec_tx_data *itd)
>>>   {
>>> -       u32 vlan_macip_lens, type_tucmd, mss_l4len_idx;
>>> +       u32 vlan_macip_lens, type_tucmd, mss_l4len_idx, fceof_saidx = 0;
>>>          struct sk_buff *skb = first->skb;
>>>          union {
>>>                  struct iphdr *v4;
>>> @@ -7740,7 +7741,12 @@ static int ixgbe_tso(struct ixgbe_ring *tx_ring,
>>>          vlan_macip_lens |= (ip.hdr - skb->data) <<
>>> IXGBE_ADVTXD_MACLEN_SHIFT;
>>>          vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
>>>
>>> -       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd,
>>> +       if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
>>> +               fceof_saidx |= itd->sa_idx;
>>> +               type_tucmd |= itd->flags | itd->trailer_len;
>>> +       }

So just a thought. Why bother with the TX_FLAGS_CHECK at all? It seems
like in the case that the flag isn't set you would have itd->sa_idx
equal to 0 anyway so it would still be the same result wouldn't it? It
would save you from having to zero both fceof_saidx and itd->sa_idx
since you could just pass itd->sa_idx and save yourself the extra
variable.

Also if flags and trailer_len are both being written to the same
location why not combine them in your structure into one single 32 bit
entry? It would allow you to essentially reduce this to one OR and you
could just pass itd->sa_idx directly which should be a pretty
significant savings in terms of instructions and cycles. Also you
might want to consider bumping itd->sa_idx up to a 32b value. It will
possibly cost you a cycle or so to convert the 16b value to a 32b
value before writing it. If you merge the flags and trailer length you
should have the space to spare to bump up the size.

>>> +
>>> +       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx,
>>> type_tucmd,
>>>                            mss_l4len_idx);
>>>
>>>          return 1;
>>> @@ -7756,10 +7762,12 @@ static inline bool ixgbe_ipv6_csum_is_sctp(struct
>>> sk_buff *skb)
>>>   }
>>>
>>>   static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
>>> -                         struct ixgbe_tx_buffer *first)
>>> +                         struct ixgbe_tx_buffer *first,
>>> +                         struct ixgbe_ipsec_tx_data *itd)
>>>   {
>>>          struct sk_buff *skb = first->skb;
>>>          u32 vlan_macip_lens = 0;
>>> +       u32 fceof_saidx = 0;
>>>          u32 type_tucmd = 0;
>>>
>>>          if (skb->ip_summed != CHECKSUM_PARTIAL) {
>>> @@ -7800,7 +7808,12 @@ static void ixgbe_tx_csum(struct ixgbe_ring
>>> *tx_ring,
>>>          vlan_macip_lens |= skb_network_offset(skb) <<
>>> IXGBE_ADVTXD_MACLEN_SHIFT;
>>>          vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
>>>
>>> -       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd, 0);
>>> +       if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
>>> +               fceof_saidx |= itd->sa_idx;
>>> +               type_tucmd |= itd->flags | itd->trailer_len;
>>> +       }
>>> +
>>> +       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx,
>>> type_tucmd, 0);
>>>   }
>>>
>>>   #define IXGBE_SET_FLAG(_input, _flag, _result) \
>>> @@ -7843,11 +7856,16 @@ static void ixgbe_tx_olinfo_status(union
>>> ixgbe_adv_tx_desc *tx_desc,
>>>                                          IXGBE_TX_FLAGS_CSUM,
>>>                                          IXGBE_ADVTXD_POPTS_TXSM);
>>>
>>> -       /* enble IPv4 checksum for TSO */
>>> +       /* enable IPv4 checksum for TSO */
>>>          olinfo_status |= IXGBE_SET_FLAG(tx_flags,
>>>                                          IXGBE_TX_FLAGS_IPV4,
>>>                                          IXGBE_ADVTXD_POPTS_IXSM);
>>>
>>> +       /* enable IPsec */
>>> +       olinfo_status |= IXGBE_SET_FLAG(tx_flags,
>>> +                                       IXGBE_TX_FLAGS_IPSEC,
>>> +                                       IXGBE_ADVTXD_POPTS_IPSEC);
>>> +
>>>          /*
>>>           * Check Context must be set if Tx switch is enabled, which it
>>>           * always is for case where virtual functions are running
>>> @@ -8306,6 +8324,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff
>>> *skb,
>>>          u32 tx_flags = 0;
>>>          unsigned short f;
>>>          u16 count = TXD_USE_COUNT(skb_headlen(skb));
>>> +       struct ixgbe_ipsec_tx_data ipsec_tx = { 0 };
>>>          __be16 protocol = skb->protocol;
>>>          u8 hdr_len = 0;
>>>
>>> @@ -8394,6 +8413,9 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff
>>> *skb,
>>>                  }
>>>          }
>>>
>>> +       if (skb->sp && ixgbe_ipsec_tx(tx_ring, skb, protocol, &ipsec_tx))
>>> +               tx_flags |= IXGBE_TX_FLAGS_IPSEC | IXGBE_TX_FLAGS_CC;
>>
>>
>> You might just want to pull the skb->sp check into ixgbe_ipsec_tx and
>> could pass tx_flags as a part of the first buffer. It doesn't really
>> matter anyway as most of this will just be inlined so it will all end
>> up a part of the same function anyway.
>
>
> Since the function is defined in a different .o file, are you sure it will
> get inlined?  I put the skb->sp check here to make sure we don't do an
> unnecessary jump.

You're right. I forgot you are defining this in a different file.

Still I would like to see this moved down though. Where it is at
doesn't really flow with everything else since FCoE and this aren't
likely to ever interact so I would rather us check for FCoE and then
get into the IPsec logic.

>>
>> Also I would move this down so that it is handled after the fields in
>> the first buffer_info structure are set. Then this can ll just fall
>> inline with the TSO block and get handled there.
>>
>>> +
>>>          /* record initial flags and protocol */
>>>          first->tx_flags = tx_flags;
>>>          first->protocol = protocol;
>>> @@ -8410,11 +8432,11 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff
>>> *skb,
>>>          }
>>>
>>>   #endif /* IXGBE_FCOE */
>>
>>
>> So if you move the function down here it will help to avoid any other
>> complication. In addition you could follow the same logic that we do
>> for ixgbe_tso/fso so you could drop the frame instead of transmitting
>> it if it is requesting a bad offload.
>
>
> Sure
>
> sln
>
>
>>
>>> -       tso = ixgbe_tso(tx_ring, first, &hdr_len);
>>> +       tso = ixgbe_tso(tx_ring, first, &hdr_len, &ipsec_tx);
>>>          if (tso < 0)
>>>                  goto out_drop;
>>>          else if (!tso)
>>> -               ixgbe_tx_csum(tx_ring, first);
>>> +               ixgbe_tx_csum(tx_ring, first, &ipsec_tx);
>>>
>>>          /* add the ATR filter if ATR is on */
>>>          if (test_bit(__IXGBE_TX_FDIR_INIT_DONE, &tx_ring->state))
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>>> index 3df0763..0ac725fa 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>>> @@ -2856,7 +2856,7 @@ union ixgbe_adv_rx_desc {
>>>   /* Context descriptors */
>>>   struct ixgbe_adv_tx_context_desc {
>>>          __le32 vlan_macip_lens;
>>> -       __le32 seqnum_seed;
>>> +       __le32 fceof_saidx;
>>>          __le32 type_tucmd_mlhl;
>>>          __le32 mss_l4len_idx;
>>>   };
>>> --
>>> 2.7.4
>>>
>>> _______________________________________________
>>> Intel-wired-lan mailing list
>>> Intel-wired-lan@osuosl.org
>>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 08/10] ixgbe: process the Tx ipsec offload
@ 2017-12-07 17:56         ` Alexander Duyck
  0 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-07 17:56 UTC (permalink / raw)
  To: intel-wired-lan

On Wed, Dec 6, 2017 at 9:43 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> On 12/5/2017 10:13 AM, Alexander Duyck wrote:
>>
>> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
>> <shannon.nelson@oracle.com> wrote:
>>>
>>> If the skb has a security association referenced in the skb, then
>>> set up the Tx descriptor with the ipsec offload bits.  While we're
>>> here, we fix an oddly named field in the context descriptor struct.
>>>
>>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>>> ---
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe.h       | 10 +++-
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 77
>>> ++++++++++++++++++++++++++
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c   |  4 +-
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 38 ++++++++++---
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_type.h  |  2 +-
>>>   5 files changed, 118 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> index 77f07dc..68097fe 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>> @@ -171,10 +171,11 @@ enum ixgbe_tx_flags {
>>>          IXGBE_TX_FLAGS_CC       = 0x08,
>>>          IXGBE_TX_FLAGS_IPV4     = 0x10,
>>>          IXGBE_TX_FLAGS_CSUM     = 0x20,
>>> +       IXGBE_TX_FLAGS_IPSEC    = 0x40,
>>>
>>>          /* software defined flags */
>>> -       IXGBE_TX_FLAGS_SW_VLAN  = 0x40,
>>> -       IXGBE_TX_FLAGS_FCOE     = 0x80,
>>> +       IXGBE_TX_FLAGS_SW_VLAN  = 0x80,
>>> +       IXGBE_TX_FLAGS_FCOE     = 0x100,
>>>   };
>>>
>>>   /* VLAN info */
>>> @@ -1012,12 +1013,17 @@ void ixgbe_init_ipsec_offload(struct
>>> ixgbe_adapter *adapter);
>>>   void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>>                      union ixgbe_adv_rx_desc *rx_desc,
>>>                      struct sk_buff *skb);
>>> +int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
>>> +                  __be16 protocol, struct ixgbe_ipsec_tx_data *itd);
>>>   void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>>>   #else
>>>   static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter
>>> *adapter) { };
>>>   static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>>                                    union ixgbe_adv_rx_desc *rx_desc,
>>>                                    struct sk_buff *skb) { };
>>> +static inline int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring,
>>> +                                struct sk_buff *skb, __be16 protocol,
>>> +                                struct ixgbe_ipsec_tx_data *itd) {
>>> return 0; };
>>>   static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) {
>>> };
>>>   #endif /* CONFIG_XFRM_OFFLOAD */
>>>   #endif /* _IXGBE_H_ */
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> index fd06d9b..2a0dd7a 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>> @@ -703,12 +703,89 @@ static void ixgbe_ipsec_del_sa(struct xfrm_state
>>> *xs)
>>>          }
>>>   }
>>>
>>> +/**
>>> + * ixgbe_ipsec_offload_ok - can this packet use the xfrm hw offload
>>> + * @skb: current data packet
>>> + * @xs: pointer to transformer state struct
>>> + **/
>>> +static bool ixgbe_ipsec_offload_ok(struct sk_buff *skb, struct
>>> xfrm_state *xs)
>>> +{
>>> +       if (xs->props.family == AF_INET) {
>>> +               /* Offload with IPv4 options is not supported yet */
>>> +               if (ip_hdr(skb)->ihl > 5)
>>
>>
>> I would make this ihl != 5 instead of "> 5" since smaller values would
>> be invalid as well.
>
>
> Sure
>
>
>>
>>> +                       return false;
>>> +       } else {
>>> +               /* Offload with IPv6 extension headers is not support yet
>>> */
>>> +               if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
>>> +                       return false;
>>> +       }
>>> +
>>> +       return true;
>>> +}
>>> +
>>>   static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
>>>          .xdo_dev_state_add = ixgbe_ipsec_add_sa,
>>>          .xdo_dev_state_delete = ixgbe_ipsec_del_sa,
>>> +       .xdo_dev_offload_ok = ixgbe_ipsec_offload_ok,
>>>   };
>>>
>>>   /**
>>> + * ixgbe_ipsec_tx - setup Tx flags for ipsec offload
>>> + * @tx_ring: outgoing context
>>> + * @skb: current data packet
>>> + * @protocol: network protocol
>>> + * @itd: ipsec Tx data for later use in building context descriptor
>>> + **/
>>> +int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
>>> +                  __be16 protocol, struct ixgbe_ipsec_tx_data *itd)
>>> +{
>>> +       struct ixgbe_adapter *adapter = netdev_priv(tx_ring->netdev);
>>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>>> +       struct xfrm_state *xs;
>>> +       struct tx_sa *tsa;
>>> +
>>> +       if (!skb->sp->len) {
>>> +               netdev_err(tx_ring->netdev, "%s: no xfrm state len =
>>> %d\n",
>>> +                          __func__, skb->sp->len);
>>> +               return 0;
>>> +       }
>>> +
>>> +       xs = xfrm_input_state(skb);
>>> +       if (!xs) {
>>> +               netdev_err(tx_ring->netdev, "%s: no xfrm_input_state() xs
>>> = %p\n",
>>> +                          __func__, xs);
>>> +               return 0;
>>> +       }
>>> +
>>> +       itd->sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX;
>>> +       if (itd->sa_idx > IXGBE_IPSEC_MAX_SA_COUNT) {
>>> +               netdev_err(tx_ring->netdev, "%s: bad sa_idx=%d
>>> handle=%lu\n",
>>> +                          __func__, itd->sa_idx,
>>> xs->xso.offload_handle);
>>> +               return 0;
>>> +       }
>>> +
>>> +       tsa = &ipsec->tx_tbl[itd->sa_idx];
>>> +       if (!tsa->used) {
>>> +               netdev_err(tx_ring->netdev, "%s: unused sa_idx=%d\n",
>>> +                          __func__, itd->sa_idx);
>>> +               return 0;
>>> +       }
>>> +
>>> +       itd->flags = 0;
>>> +       if (xs->id.proto == IPPROTO_ESP) {
>>> +               itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_TYPE_ESP |
>>> +                             IXGBE_ADVTXD_TUCMD_L4T_TCP;
>>
>>
>> Why is the TCP value being set here? This doesn't seem correct either.
>> This implies TCP a TCP offload. It seems like this should only be
>> setting ESP.
>
>
> Honestly?  Because when I was testing that, it didn't work without it. This
> was one of the things I was going to come back to when I started working on
> the csum and tso support.

We might want to try testing with that dropped to see if we need it or
not. I would suspect not since I would imagine this would cause bad
things for non-TCP traffic. Also the inner L4 header shouldn't matter
unless you are trying to offload it.

>>
>>> +               if (protocol == htons(ETH_P_IP))
>>> +                       itd->flags |= IXGBE_ADVTXD_TUCMD_IPV4;
>>
>>
>> Does the IPsec offload need to know if the frame is v4 or v6? I'm just
>> wondering if it does or not.
>
>
> Yes, I believe this is how it knows how much header to skip to find the ESP
> header.  However, I'll test that and see if it can come out.

Like I mentioned last time it might be better to have this handled in
ixgbe_tx_csum. If it is harmless we can probably just include it
there. We should be able to do it in the block after the no_csum
label. I'd be curious if not doing this up until now might have other
effects such as impacting RSS since I know the whole reason for us
having to do the CC stuff anyway was to actually get header split to
work correctly with PF/VF loopback packets. It wouldn't surprise me if
setting these fields defines the packet type received on the other
end.

>> If not then this probably isn't needed.
>> One thought on this line is you might look at moving it into
>> ixgbe_tx_csum. If setting the bit is harmless without setting IXSM we
>> might look at moving it into the end of ixgbe_tx_csum and just make it
>> compare against first->protocol there.
>
>
>>
>>> +               itd->trailer_len = xs->props.trailer_len;
>>> +       }
>>> +       if (tsa->encrypt)
>>> +               itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN;
>>> +
>>> +       return 1;
>>> +}
>>> +
>>> +/**
>>>    * ixgbe_ipsec_rx - decode ipsec bits from Rx descriptor
>>>    * @rx_ring: receiving ring
>>>    * @rx_desc: receive data descriptor
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>> index f1bfae0..d7875b3 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>> @@ -1261,7 +1261,7 @@ void ixgbe_clear_interrupt_scheme(struct
>>> ixgbe_adapter *adapter)
>>>   }
>>>
>>>   void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring, u32 vlan_macip_lens,
>>> -                      u32 fcoe_sof_eof, u32 type_tucmd, u32
>>> mss_l4len_idx)
>>> +                      u32 fceof_saidx, u32 type_tucmd, u32
>>> mss_l4len_idx)
>>>   {
>>>          struct ixgbe_adv_tx_context_desc *context_desc;
>>>          u16 i = tx_ring->next_to_use;
>>> @@ -1275,7 +1275,7 @@ void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring,
>>> u32 vlan_macip_lens,
>>>          type_tucmd |= IXGBE_TXD_CMD_DEXT | IXGBE_ADVTXD_DTYP_CTXT;
>>>
>>>          context_desc->vlan_macip_lens   = cpu_to_le32(vlan_macip_lens);
>>> -       context_desc->seqnum_seed       = cpu_to_le32(fcoe_sof_eof);
>>> +       context_desc->fceof_saidx       = cpu_to_le32(fceof_saidx);
>>>          context_desc->type_tucmd_mlhl   = cpu_to_le32(type_tucmd);
>>>          context_desc->mss_l4len_idx     = cpu_to_le32(mss_l4len_idx);
>>>   }
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> index 60f9f2d..c857594 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> @@ -7659,9 +7659,10 @@ static void ixgbe_service_task(struct work_struct
>>> *work)
>>>
>>>   static int ixgbe_tso(struct ixgbe_ring *tx_ring,
>>>                       struct ixgbe_tx_buffer *first,
>>> -                    u8 *hdr_len)
>>> +                    u8 *hdr_len,
>>> +                    struct ixgbe_ipsec_tx_data *itd)
>>>   {
>>> -       u32 vlan_macip_lens, type_tucmd, mss_l4len_idx;
>>> +       u32 vlan_macip_lens, type_tucmd, mss_l4len_idx, fceof_saidx = 0;
>>>          struct sk_buff *skb = first->skb;
>>>          union {
>>>                  struct iphdr *v4;
>>> @@ -7740,7 +7741,12 @@ static int ixgbe_tso(struct ixgbe_ring *tx_ring,
>>>          vlan_macip_lens |= (ip.hdr - skb->data) <<
>>> IXGBE_ADVTXD_MACLEN_SHIFT;
>>>          vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
>>>
>>> -       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd,
>>> +       if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
>>> +               fceof_saidx |= itd->sa_idx;
>>> +               type_tucmd |= itd->flags | itd->trailer_len;
>>> +       }

So just a thought. Why bother with the TX_FLAGS_CHECK at all? It seems
like in the case that the flag isn't set you would have itd->sa_idx
equal to 0 anyway so it would still be the same result wouldn't it? It
would save you from having to zero both fceof_saidx and itd->sa_idx
since you could just pass itd->sa_idx and save yourself the extra
variable.

Also if flags and trailer_len are both being written to the same
location why not combine them in your structure into one single 32 bit
entry? It would allow you to essentially reduce this to one OR and you
could just pass itd->sa_idx directly which should be a pretty
significant savings in terms of instructions and cycles. Also you
might want to consider bumping itd->sa_idx up to a 32b value. It will
possibly cost you a cycle or so to convert the 16b value to a 32b
value before writing it. If you merge the flags and trailer length you
should have the space to spare to bump up the size.

>>> +
>>> +       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx,
>>> type_tucmd,
>>>                            mss_l4len_idx);
>>>
>>>          return 1;
>>> @@ -7756,10 +7762,12 @@ static inline bool ixgbe_ipv6_csum_is_sctp(struct
>>> sk_buff *skb)
>>>   }
>>>
>>>   static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
>>> -                         struct ixgbe_tx_buffer *first)
>>> +                         struct ixgbe_tx_buffer *first,
>>> +                         struct ixgbe_ipsec_tx_data *itd)
>>>   {
>>>          struct sk_buff *skb = first->skb;
>>>          u32 vlan_macip_lens = 0;
>>> +       u32 fceof_saidx = 0;
>>>          u32 type_tucmd = 0;
>>>
>>>          if (skb->ip_summed != CHECKSUM_PARTIAL) {
>>> @@ -7800,7 +7808,12 @@ static void ixgbe_tx_csum(struct ixgbe_ring
>>> *tx_ring,
>>>          vlan_macip_lens |= skb_network_offset(skb) <<
>>> IXGBE_ADVTXD_MACLEN_SHIFT;
>>>          vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
>>>
>>> -       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd, 0);
>>> +       if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
>>> +               fceof_saidx |= itd->sa_idx;
>>> +               type_tucmd |= itd->flags | itd->trailer_len;
>>> +       }
>>> +
>>> +       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx,
>>> type_tucmd, 0);
>>>   }
>>>
>>>   #define IXGBE_SET_FLAG(_input, _flag, _result) \
>>> @@ -7843,11 +7856,16 @@ static void ixgbe_tx_olinfo_status(union
>>> ixgbe_adv_tx_desc *tx_desc,
>>>                                          IXGBE_TX_FLAGS_CSUM,
>>>                                          IXGBE_ADVTXD_POPTS_TXSM);
>>>
>>> -       /* enble IPv4 checksum for TSO */
>>> +       /* enable IPv4 checksum for TSO */
>>>          olinfo_status |= IXGBE_SET_FLAG(tx_flags,
>>>                                          IXGBE_TX_FLAGS_IPV4,
>>>                                          IXGBE_ADVTXD_POPTS_IXSM);
>>>
>>> +       /* enable IPsec */
>>> +       olinfo_status |= IXGBE_SET_FLAG(tx_flags,
>>> +                                       IXGBE_TX_FLAGS_IPSEC,
>>> +                                       IXGBE_ADVTXD_POPTS_IPSEC);
>>> +
>>>          /*
>>>           * Check Context must be set if Tx switch is enabled, which it
>>>           * always is for case where virtual functions are running
>>> @@ -8306,6 +8324,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff
>>> *skb,
>>>          u32 tx_flags = 0;
>>>          unsigned short f;
>>>          u16 count = TXD_USE_COUNT(skb_headlen(skb));
>>> +       struct ixgbe_ipsec_tx_data ipsec_tx = { 0 };
>>>          __be16 protocol = skb->protocol;
>>>          u8 hdr_len = 0;
>>>
>>> @@ -8394,6 +8413,9 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff
>>> *skb,
>>>                  }
>>>          }
>>>
>>> +       if (skb->sp && ixgbe_ipsec_tx(tx_ring, skb, protocol, &ipsec_tx))
>>> +               tx_flags |= IXGBE_TX_FLAGS_IPSEC | IXGBE_TX_FLAGS_CC;
>>
>>
>> You might just want to pull the skb->sp check into ixgbe_ipsec_tx and
>> could pass tx_flags as a part of the first buffer. It doesn't really
>> matter anyway as most of this will just be inlined so it will all end
>> up a part of the same function anyway.
>
>
> Since the function is defined in a different .o file, are you sure it will
> get inlined?  I put the skb->sp check here to make sure we don't do an
> unnecessary jump.

You're right. I forgot you are defining this in a different file.

Still I would like to see this moved down though. Where it is at
doesn't really flow with everything else since FCoE and this aren't
likely to ever interact so I would rather us check for FCoE and then
get into the IPsec logic.

>>
>> Also I would move this down so that it is handled after the fields in
>> the first buffer_info structure are set. Then this can ll just fall
>> inline with the TSO block and get handled there.
>>
>>> +
>>>          /* record initial flags and protocol */
>>>          first->tx_flags = tx_flags;
>>>          first->protocol = protocol;
>>> @@ -8410,11 +8432,11 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff
>>> *skb,
>>>          }
>>>
>>>   #endif /* IXGBE_FCOE */
>>
>>
>> So if you move the function down here it will help to avoid any other
>> complication. In addition you could follow the same logic that we do
>> for ixgbe_tso/fso so you could drop the frame instead of transmitting
>> it if it is requesting a bad offload.
>
>
> Sure
>
> sln
>
>
>>
>>> -       tso = ixgbe_tso(tx_ring, first, &hdr_len);
>>> +       tso = ixgbe_tso(tx_ring, first, &hdr_len, &ipsec_tx);
>>>          if (tso < 0)
>>>                  goto out_drop;
>>>          else if (!tso)
>>> -               ixgbe_tx_csum(tx_ring, first);
>>> +               ixgbe_tx_csum(tx_ring, first, &ipsec_tx);
>>>
>>>          /* add the ATR filter if ATR is on */
>>>          if (test_bit(__IXGBE_TX_FDIR_INIT_DONE, &tx_ring->state))
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>>> index 3df0763..0ac725fa 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>>> @@ -2856,7 +2856,7 @@ union ixgbe_adv_rx_desc {
>>>   /* Context descriptors */
>>>   struct ixgbe_adv_tx_context_desc {
>>>          __le32 vlan_macip_lens;
>>> -       __le32 seqnum_seed;
>>> +       __le32 fceof_saidx;
>>>          __le32 type_tucmd_mlhl;
>>>          __le32 mss_l4len_idx;
>>>   };
>>> --
>>> 2.7.4
>>>
>>> _______________________________________________
>>> Intel-wired-lan mailing list
>>> Intel-wired-lan at osuosl.org
>>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 06/10] ixgbe: restore offloaded SAs after a reset
  2017-12-07 17:16         ` Alexander Duyck
@ 2017-12-07 18:47           ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07 18:47 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On 12/7/2017 9:16 AM, Alexander Duyck wrote:
> On Wed, Dec 6, 2017 at 9:43 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> On 12/5/2017 9:30 AM, Alexander Duyck wrote:
>>>
>>> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
>>> <shannon.nelson@oracle.com> wrote:
>>>>
>>>> On a chip reset most of the table contents are lost, so must be

<snip>

>>>> +       /* reload the IP addrs */
>>>> +       for (i = 0; i < IXGBE_IPSEC_MAX_RX_IP_COUNT; i++) {
>>>> +               struct rx_ip_sa *ipsa = &ipsec->ip_tbl[i];
>>>> +
>>>> +               if (ipsa->used)
>>>> +                       ixgbe_ipsec_set_rx_ip(hw, i, ipsa->ipaddr);
>>>> +               else
>>>> +                       ixgbe_ipsec_set_rx_ip(hw, i, zbuf);
>>>
>>>
>>> If we are doing a restore do we actually need to write the zero
>>> values? If we did a reset I thought you had a function that was going
>>> through and zeroing everything out. If so this now becomes redundant.
>>
>>
>> Currently ixgbe_ipsec_clear_hw_tables() only gets run at at probe.  It
>> should probably get run at remove as well.  Doing this is a bit of safety
>> paranoia, and making sure the CAM memory structures that don't get cleared
>> on reset have exactly what I expect in them.
> 
> You might just move ixgbe_ipsec_clear_hw_tables into the rest logic
> itself. Then it covers all cases where you would be resetting the
> hardware and expecting a consistent state. It will mean writing some
> registers twice during the reset but it is probably better just to
> make certain everything stays in a known good state after a reset.

If it is a small number, e.g. 10 or 20, then you may be right.  However, 
given we have table space for 2k different SAs, at 6 writes per Tx SA 
and 10 writes per Rx SA, plus 128 IP address with 4 writes each, we are 
already looking at 17K writes already to be sure the tables are clean.

Unfortunately, I don't really know what a "typical" case will be, so I 
don't know how many SA we may be offloading at any one time.  But in a 
busy cloud support server, we might have nearly full tables.  If we do 
the full clean first, then have to fill all the tables, we're now 
looking at up to 35k writes slowing down the reset process.

I'd rather keep it to the constant 17K writes for now, and look later at 
using the VALID bit in the IPSRXMOD to see if we can at least cut down 
on the Rx writes.

sln

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 06/10] ixgbe: restore offloaded SAs after a reset
@ 2017-12-07 18:47           ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07 18:47 UTC (permalink / raw)
  To: intel-wired-lan

On 12/7/2017 9:16 AM, Alexander Duyck wrote:
> On Wed, Dec 6, 2017 at 9:43 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> On 12/5/2017 9:30 AM, Alexander Duyck wrote:
>>>
>>> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
>>> <shannon.nelson@oracle.com> wrote:
>>>>
>>>> On a chip reset most of the table contents are lost, so must be

<snip>

>>>> +       /* reload the IP addrs */
>>>> +       for (i = 0; i < IXGBE_IPSEC_MAX_RX_IP_COUNT; i++) {
>>>> +               struct rx_ip_sa *ipsa = &ipsec->ip_tbl[i];
>>>> +
>>>> +               if (ipsa->used)
>>>> +                       ixgbe_ipsec_set_rx_ip(hw, i, ipsa->ipaddr);
>>>> +               else
>>>> +                       ixgbe_ipsec_set_rx_ip(hw, i, zbuf);
>>>
>>>
>>> If we are doing a restore do we actually need to write the zero
>>> values? If we did a reset I thought you had a function that was going
>>> through and zeroing everything out. If so this now becomes redundant.
>>
>>
>> Currently ixgbe_ipsec_clear_hw_tables() only gets run at at probe.  It
>> should probably get run at remove as well.  Doing this is a bit of safety
>> paranoia, and making sure the CAM memory structures that don't get cleared
>> on reset have exactly what I expect in them.
> 
> You might just move ixgbe_ipsec_clear_hw_tables into the rest logic
> itself. Then it covers all cases where you would be resetting the
> hardware and expecting a consistent state. It will mean writing some
> registers twice during the reset but it is probably better just to
> make certain everything stays in a known good state after a reset.

If it is a small number, e.g. 10 or 20, then you may be right.  However, 
given we have table space for 2k different SAs, at 6 writes per Tx SA 
and 10 writes per Rx SA, plus 128 IP address with 4 writes each, we are 
already looking at 17K writes already to be sure the tables are clean.

Unfortunately, I don't really know what a "typical" case will be, so I 
don't know how many SA we may be offloading at any one time.  But in a 
busy cloud support server, we might have nearly full tables.  If we do 
the full clean first, then have to fill all the tables, we're now 
looking at up to 35k writes slowing down the reset process.

I'd rather keep it to the constant 17K writes for now, and look later at 
using the VALID bit in the IPSRXMOD to see if we can at least cut down 
on the Rx writes.

sln

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 08/10] ixgbe: process the Tx ipsec offload
  2017-12-07 17:56         ` Alexander Duyck
@ 2017-12-07 18:50           ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07 18:50 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On 12/7/2017 9:56 AM, Alexander Duyck wrote:

You've suggested several things here, all good things to look into, 
which I will do, most now, some in the near future.

Thanks!
sln

> On Wed, Dec 6, 2017 at 9:43 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> On 12/5/2017 10:13 AM, Alexander Duyck wrote:
>>>
>>> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
>>> <shannon.nelson@oracle.com> wrote:
>>>>
>>>> If the skb has a security association referenced in the skb, then
>>>> set up the Tx descriptor with the ipsec offload bits.  While we're
>>>> here, we fix an oddly named field in the context descriptor struct.
>>>>
>>>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>>>> ---
>>>>    drivers/net/ethernet/intel/ixgbe/ixgbe.h       | 10 +++-
>>>>    drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 77
>>>> ++++++++++++++++++++++++++
>>>>    drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c   |  4 +-
>>>>    drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 38 ++++++++++---
>>>>    drivers/net/ethernet/intel/ixgbe/ixgbe_type.h  |  2 +-
>>>>    5 files changed, 118 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>>> index 77f07dc..68097fe 100644
>>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>>> @@ -171,10 +171,11 @@ enum ixgbe_tx_flags {
>>>>           IXGBE_TX_FLAGS_CC       = 0x08,
>>>>           IXGBE_TX_FLAGS_IPV4     = 0x10,
>>>>           IXGBE_TX_FLAGS_CSUM     = 0x20,
>>>> +       IXGBE_TX_FLAGS_IPSEC    = 0x40,
>>>>
>>>>           /* software defined flags */
>>>> -       IXGBE_TX_FLAGS_SW_VLAN  = 0x40,
>>>> -       IXGBE_TX_FLAGS_FCOE     = 0x80,
>>>> +       IXGBE_TX_FLAGS_SW_VLAN  = 0x80,
>>>> +       IXGBE_TX_FLAGS_FCOE     = 0x100,
>>>>    };
>>>>
>>>>    /* VLAN info */
>>>> @@ -1012,12 +1013,17 @@ void ixgbe_init_ipsec_offload(struct
>>>> ixgbe_adapter *adapter);
>>>>    void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>>>                       union ixgbe_adv_rx_desc *rx_desc,
>>>>                       struct sk_buff *skb);
>>>> +int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
>>>> +                  __be16 protocol, struct ixgbe_ipsec_tx_data *itd);
>>>>    void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>>>>    #else
>>>>    static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter
>>>> *adapter) { };
>>>>    static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>>>                                     union ixgbe_adv_rx_desc *rx_desc,
>>>>                                     struct sk_buff *skb) { };
>>>> +static inline int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring,
>>>> +                                struct sk_buff *skb, __be16 protocol,
>>>> +                                struct ixgbe_ipsec_tx_data *itd) {
>>>> return 0; };
>>>>    static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) {
>>>> };
>>>>    #endif /* CONFIG_XFRM_OFFLOAD */
>>>>    #endif /* _IXGBE_H_ */
>>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>>> index fd06d9b..2a0dd7a 100644
>>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>>> @@ -703,12 +703,89 @@ static void ixgbe_ipsec_del_sa(struct xfrm_state
>>>> *xs)
>>>>           }
>>>>    }
>>>>
>>>> +/**
>>>> + * ixgbe_ipsec_offload_ok - can this packet use the xfrm hw offload
>>>> + * @skb: current data packet
>>>> + * @xs: pointer to transformer state struct
>>>> + **/
>>>> +static bool ixgbe_ipsec_offload_ok(struct sk_buff *skb, struct
>>>> xfrm_state *xs)
>>>> +{
>>>> +       if (xs->props.family == AF_INET) {
>>>> +               /* Offload with IPv4 options is not supported yet */
>>>> +               if (ip_hdr(skb)->ihl > 5)
>>>
>>>
>>> I would make this ihl != 5 instead of "> 5" since smaller values would
>>> be invalid as well.
>>
>>
>> Sure
>>
>>
>>>
>>>> +                       return false;
>>>> +       } else {
>>>> +               /* Offload with IPv6 extension headers is not support yet
>>>> */
>>>> +               if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
>>>> +                       return false;
>>>> +       }
>>>> +
>>>> +       return true;
>>>> +}
>>>> +
>>>>    static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
>>>>           .xdo_dev_state_add = ixgbe_ipsec_add_sa,
>>>>           .xdo_dev_state_delete = ixgbe_ipsec_del_sa,
>>>> +       .xdo_dev_offload_ok = ixgbe_ipsec_offload_ok,
>>>>    };
>>>>
>>>>    /**
>>>> + * ixgbe_ipsec_tx - setup Tx flags for ipsec offload
>>>> + * @tx_ring: outgoing context
>>>> + * @skb: current data packet
>>>> + * @protocol: network protocol
>>>> + * @itd: ipsec Tx data for later use in building context descriptor
>>>> + **/
>>>> +int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
>>>> +                  __be16 protocol, struct ixgbe_ipsec_tx_data *itd)
>>>> +{
>>>> +       struct ixgbe_adapter *adapter = netdev_priv(tx_ring->netdev);
>>>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>>>> +       struct xfrm_state *xs;
>>>> +       struct tx_sa *tsa;
>>>> +
>>>> +       if (!skb->sp->len) {
>>>> +               netdev_err(tx_ring->netdev, "%s: no xfrm state len =
>>>> %d\n",
>>>> +                          __func__, skb->sp->len);
>>>> +               return 0;
>>>> +       }
>>>> +
>>>> +       xs = xfrm_input_state(skb);
>>>> +       if (!xs) {
>>>> +               netdev_err(tx_ring->netdev, "%s: no xfrm_input_state() xs
>>>> = %p\n",
>>>> +                          __func__, xs);
>>>> +               return 0;
>>>> +       }
>>>> +
>>>> +       itd->sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX;
>>>> +       if (itd->sa_idx > IXGBE_IPSEC_MAX_SA_COUNT) {
>>>> +               netdev_err(tx_ring->netdev, "%s: bad sa_idx=%d
>>>> handle=%lu\n",
>>>> +                          __func__, itd->sa_idx,
>>>> xs->xso.offload_handle);
>>>> +               return 0;
>>>> +       }
>>>> +
>>>> +       tsa = &ipsec->tx_tbl[itd->sa_idx];
>>>> +       if (!tsa->used) {
>>>> +               netdev_err(tx_ring->netdev, "%s: unused sa_idx=%d\n",
>>>> +                          __func__, itd->sa_idx);
>>>> +               return 0;
>>>> +       }
>>>> +
>>>> +       itd->flags = 0;
>>>> +       if (xs->id.proto == IPPROTO_ESP) {
>>>> +               itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_TYPE_ESP |
>>>> +                             IXGBE_ADVTXD_TUCMD_L4T_TCP;
>>>
>>>
>>> Why is the TCP value being set here? This doesn't seem correct either.
>>> This implies TCP a TCP offload. It seems like this should only be
>>> setting ESP.
>>
>>
>> Honestly?  Because when I was testing that, it didn't work without it. This
>> was one of the things I was going to come back to when I started working on
>> the csum and tso support.
> 
> We might want to try testing with that dropped to see if we need it or
> not. I would suspect not since I would imagine this would cause bad
> things for non-TCP traffic. Also the inner L4 header shouldn't matter
> unless you are trying to offload it.
> 
>>>
>>>> +               if (protocol == htons(ETH_P_IP))
>>>> +                       itd->flags |= IXGBE_ADVTXD_TUCMD_IPV4;
>>>
>>>
>>> Does the IPsec offload need to know if the frame is v4 or v6? I'm just
>>> wondering if it does or not.
>>
>>
>> Yes, I believe this is how it knows how much header to skip to find the ESP
>> header.  However, I'll test that and see if it can come out.
> 
> Like I mentioned last time it might be better to have this handled in
> ixgbe_tx_csum. If it is harmless we can probably just include it
> there. We should be able to do it in the block after the no_csum
> label. I'd be curious if not doing this up until now might have other
> effects such as impacting RSS since I know the whole reason for us
> having to do the CC stuff anyway was to actually get header split to
> work correctly with PF/VF loopback packets. It wouldn't surprise me if
> setting these fields defines the packet type received on the other
> end.
> 
>>> If not then this probably isn't needed.
>>> One thought on this line is you might look at moving it into
>>> ixgbe_tx_csum. If setting the bit is harmless without setting IXSM we
>>> might look at moving it into the end of ixgbe_tx_csum and just make it
>>> compare against first->protocol there.
>>
>>
>>>
>>>> +               itd->trailer_len = xs->props.trailer_len;
>>>> +       }
>>>> +       if (tsa->encrypt)
>>>> +               itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN;
>>>> +
>>>> +       return 1;
>>>> +}
>>>> +
>>>> +/**
>>>>     * ixgbe_ipsec_rx - decode ipsec bits from Rx descriptor
>>>>     * @rx_ring: receiving ring
>>>>     * @rx_desc: receive data descriptor
>>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>>> index f1bfae0..d7875b3 100644
>>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>>> @@ -1261,7 +1261,7 @@ void ixgbe_clear_interrupt_scheme(struct
>>>> ixgbe_adapter *adapter)
>>>>    }
>>>>
>>>>    void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring, u32 vlan_macip_lens,
>>>> -                      u32 fcoe_sof_eof, u32 type_tucmd, u32
>>>> mss_l4len_idx)
>>>> +                      u32 fceof_saidx, u32 type_tucmd, u32
>>>> mss_l4len_idx)
>>>>    {
>>>>           struct ixgbe_adv_tx_context_desc *context_desc;
>>>>           u16 i = tx_ring->next_to_use;
>>>> @@ -1275,7 +1275,7 @@ void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring,
>>>> u32 vlan_macip_lens,
>>>>           type_tucmd |= IXGBE_TXD_CMD_DEXT | IXGBE_ADVTXD_DTYP_CTXT;
>>>>
>>>>           context_desc->vlan_macip_lens   = cpu_to_le32(vlan_macip_lens);
>>>> -       context_desc->seqnum_seed       = cpu_to_le32(fcoe_sof_eof);
>>>> +       context_desc->fceof_saidx       = cpu_to_le32(fceof_saidx);
>>>>           context_desc->type_tucmd_mlhl   = cpu_to_le32(type_tucmd);
>>>>           context_desc->mss_l4len_idx     = cpu_to_le32(mss_l4len_idx);
>>>>    }
>>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>>> index 60f9f2d..c857594 100644
>>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>>> @@ -7659,9 +7659,10 @@ static void ixgbe_service_task(struct work_struct
>>>> *work)
>>>>
>>>>    static int ixgbe_tso(struct ixgbe_ring *tx_ring,
>>>>                        struct ixgbe_tx_buffer *first,
>>>> -                    u8 *hdr_len)
>>>> +                    u8 *hdr_len,
>>>> +                    struct ixgbe_ipsec_tx_data *itd)
>>>>    {
>>>> -       u32 vlan_macip_lens, type_tucmd, mss_l4len_idx;
>>>> +       u32 vlan_macip_lens, type_tucmd, mss_l4len_idx, fceof_saidx = 0;
>>>>           struct sk_buff *skb = first->skb;
>>>>           union {
>>>>                   struct iphdr *v4;
>>>> @@ -7740,7 +7741,12 @@ static int ixgbe_tso(struct ixgbe_ring *tx_ring,
>>>>           vlan_macip_lens |= (ip.hdr - skb->data) <<
>>>> IXGBE_ADVTXD_MACLEN_SHIFT;
>>>>           vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
>>>>
>>>> -       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd,
>>>> +       if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
>>>> +               fceof_saidx |= itd->sa_idx;
>>>> +               type_tucmd |= itd->flags | itd->trailer_len;
>>>> +       }
> 
> So just a thought. Why bother with the TX_FLAGS_CHECK at all? It seems
> like in the case that the flag isn't set you would have itd->sa_idx
> equal to 0 anyway so it would still be the same result wouldn't it? It
> would save you from having to zero both fceof_saidx and itd->sa_idx
> since you could just pass itd->sa_idx and save yourself the extra
> variable.
> 
> Also if flags and trailer_len are both being written to the same
> location why not combine them in your structure into one single 32 bit
> entry? It would allow you to essentially reduce this to one OR and you
> could just pass itd->sa_idx directly which should be a pretty
> significant savings in terms of instructions and cycles. Also you
> might want to consider bumping itd->sa_idx up to a 32b value. It will
> possibly cost you a cycle or so to convert the 16b value to a 32b
> value before writing it. If you merge the flags and trailer length you
> should have the space to spare to bump up the size.
> 
>>>> +
>>>> +       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx,
>>>> type_tucmd,
>>>>                             mss_l4len_idx);
>>>>
>>>>           return 1;
>>>> @@ -7756,10 +7762,12 @@ static inline bool ixgbe_ipv6_csum_is_sctp(struct
>>>> sk_buff *skb)
>>>>    }
>>>>
>>>>    static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
>>>> -                         struct ixgbe_tx_buffer *first)
>>>> +                         struct ixgbe_tx_buffer *first,
>>>> +                         struct ixgbe_ipsec_tx_data *itd)
>>>>    {
>>>>           struct sk_buff *skb = first->skb;
>>>>           u32 vlan_macip_lens = 0;
>>>> +       u32 fceof_saidx = 0;
>>>>           u32 type_tucmd = 0;
>>>>
>>>>           if (skb->ip_summed != CHECKSUM_PARTIAL) {
>>>> @@ -7800,7 +7808,12 @@ static void ixgbe_tx_csum(struct ixgbe_ring
>>>> *tx_ring,
>>>>           vlan_macip_lens |= skb_network_offset(skb) <<
>>>> IXGBE_ADVTXD_MACLEN_SHIFT;
>>>>           vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
>>>>
>>>> -       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd, 0);
>>>> +       if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
>>>> +               fceof_saidx |= itd->sa_idx;
>>>> +               type_tucmd |= itd->flags | itd->trailer_len;
>>>> +       }
>>>> +
>>>> +       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx,
>>>> type_tucmd, 0);
>>>>    }
>>>>
>>>>    #define IXGBE_SET_FLAG(_input, _flag, _result) \
>>>> @@ -7843,11 +7856,16 @@ static void ixgbe_tx_olinfo_status(union
>>>> ixgbe_adv_tx_desc *tx_desc,
>>>>                                           IXGBE_TX_FLAGS_CSUM,
>>>>                                           IXGBE_ADVTXD_POPTS_TXSM);
>>>>
>>>> -       /* enble IPv4 checksum for TSO */
>>>> +       /* enable IPv4 checksum for TSO */
>>>>           olinfo_status |= IXGBE_SET_FLAG(tx_flags,
>>>>                                           IXGBE_TX_FLAGS_IPV4,
>>>>                                           IXGBE_ADVTXD_POPTS_IXSM);
>>>>
>>>> +       /* enable IPsec */
>>>> +       olinfo_status |= IXGBE_SET_FLAG(tx_flags,
>>>> +                                       IXGBE_TX_FLAGS_IPSEC,
>>>> +                                       IXGBE_ADVTXD_POPTS_IPSEC);
>>>> +
>>>>           /*
>>>>            * Check Context must be set if Tx switch is enabled, which it
>>>>            * always is for case where virtual functions are running
>>>> @@ -8306,6 +8324,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff
>>>> *skb,
>>>>           u32 tx_flags = 0;
>>>>           unsigned short f;
>>>>           u16 count = TXD_USE_COUNT(skb_headlen(skb));
>>>> +       struct ixgbe_ipsec_tx_data ipsec_tx = { 0 };
>>>>           __be16 protocol = skb->protocol;
>>>>           u8 hdr_len = 0;
>>>>
>>>> @@ -8394,6 +8413,9 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff
>>>> *skb,
>>>>                   }
>>>>           }
>>>>
>>>> +       if (skb->sp && ixgbe_ipsec_tx(tx_ring, skb, protocol, &ipsec_tx))
>>>> +               tx_flags |= IXGBE_TX_FLAGS_IPSEC | IXGBE_TX_FLAGS_CC;
>>>
>>>
>>> You might just want to pull the skb->sp check into ixgbe_ipsec_tx and
>>> could pass tx_flags as a part of the first buffer. It doesn't really
>>> matter anyway as most of this will just be inlined so it will all end
>>> up a part of the same function anyway.
>>
>>
>> Since the function is defined in a different .o file, are you sure it will
>> get inlined?  I put the skb->sp check here to make sure we don't do an
>> unnecessary jump.
> 
> You're right. I forgot you are defining this in a different file.
> 
> Still I would like to see this moved down though. Where it is at
> doesn't really flow with everything else since FCoE and this aren't
> likely to ever interact so I would rather us check for FCoE and then
> get into the IPsec logic.
> 
>>>
>>> Also I would move this down so that it is handled after the fields in
>>> the first buffer_info structure are set. Then this can ll just fall
>>> inline with the TSO block and get handled there.
>>>
>>>> +
>>>>           /* record initial flags and protocol */
>>>>           first->tx_flags = tx_flags;
>>>>           first->protocol = protocol;
>>>> @@ -8410,11 +8432,11 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff
>>>> *skb,
>>>>           }
>>>>
>>>>    #endif /* IXGBE_FCOE */
>>>
>>>
>>> So if you move the function down here it will help to avoid any other
>>> complication. In addition you could follow the same logic that we do
>>> for ixgbe_tso/fso so you could drop the frame instead of transmitting
>>> it if it is requesting a bad offload.
>>
>>
>> Sure
>>
>> sln
>>
>>
>>>
>>>> -       tso = ixgbe_tso(tx_ring, first, &hdr_len);
>>>> +       tso = ixgbe_tso(tx_ring, first, &hdr_len, &ipsec_tx);
>>>>           if (tso < 0)
>>>>                   goto out_drop;
>>>>           else if (!tso)
>>>> -               ixgbe_tx_csum(tx_ring, first);
>>>> +               ixgbe_tx_csum(tx_ring, first, &ipsec_tx);
>>>>
>>>>           /* add the ATR filter if ATR is on */
>>>>           if (test_bit(__IXGBE_TX_FDIR_INIT_DONE, &tx_ring->state))
>>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>>>> index 3df0763..0ac725fa 100644
>>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>>>> @@ -2856,7 +2856,7 @@ union ixgbe_adv_rx_desc {
>>>>    /* Context descriptors */
>>>>    struct ixgbe_adv_tx_context_desc {
>>>>           __le32 vlan_macip_lens;
>>>> -       __le32 seqnum_seed;
>>>> +       __le32 fceof_saidx;
>>>>           __le32 type_tucmd_mlhl;
>>>>           __le32 mss_l4len_idx;
>>>>    };
>>>> --
>>>> 2.7.4
>>>>
>>>> _______________________________________________
>>>> Intel-wired-lan mailing list
>>>> Intel-wired-lan@osuosl.org
>>>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 08/10] ixgbe: process the Tx ipsec offload
@ 2017-12-07 18:50           ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07 18:50 UTC (permalink / raw)
  To: intel-wired-lan

On 12/7/2017 9:56 AM, Alexander Duyck wrote:

You've suggested several things here, all good things to look into, 
which I will do, most now, some in the near future.

Thanks!
sln

> On Wed, Dec 6, 2017 at 9:43 PM, Shannon Nelson
> <shannon.nelson@oracle.com> wrote:
>> On 12/5/2017 10:13 AM, Alexander Duyck wrote:
>>>
>>> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
>>> <shannon.nelson@oracle.com> wrote:
>>>>
>>>> If the skb has a security association referenced in the skb, then
>>>> set up the Tx descriptor with the ipsec offload bits.  While we're
>>>> here, we fix an oddly named field in the context descriptor struct.
>>>>
>>>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
>>>> ---
>>>>    drivers/net/ethernet/intel/ixgbe/ixgbe.h       | 10 +++-
>>>>    drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 77
>>>> ++++++++++++++++++++++++++
>>>>    drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c   |  4 +-
>>>>    drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 38 ++++++++++---
>>>>    drivers/net/ethernet/intel/ixgbe/ixgbe_type.h  |  2 +-
>>>>    5 files changed, 118 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>>> index 77f07dc..68097fe 100644
>>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>>>> @@ -171,10 +171,11 @@ enum ixgbe_tx_flags {
>>>>           IXGBE_TX_FLAGS_CC       = 0x08,
>>>>           IXGBE_TX_FLAGS_IPV4     = 0x10,
>>>>           IXGBE_TX_FLAGS_CSUM     = 0x20,
>>>> +       IXGBE_TX_FLAGS_IPSEC    = 0x40,
>>>>
>>>>           /* software defined flags */
>>>> -       IXGBE_TX_FLAGS_SW_VLAN  = 0x40,
>>>> -       IXGBE_TX_FLAGS_FCOE     = 0x80,
>>>> +       IXGBE_TX_FLAGS_SW_VLAN  = 0x80,
>>>> +       IXGBE_TX_FLAGS_FCOE     = 0x100,
>>>>    };
>>>>
>>>>    /* VLAN info */
>>>> @@ -1012,12 +1013,17 @@ void ixgbe_init_ipsec_offload(struct
>>>> ixgbe_adapter *adapter);
>>>>    void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>>>                       union ixgbe_adv_rx_desc *rx_desc,
>>>>                       struct sk_buff *skb);
>>>> +int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
>>>> +                  __be16 protocol, struct ixgbe_ipsec_tx_data *itd);
>>>>    void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
>>>>    #else
>>>>    static inline void ixgbe_init_ipsec_offload(struct ixgbe_adapter
>>>> *adapter) { };
>>>>    static inline void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
>>>>                                     union ixgbe_adv_rx_desc *rx_desc,
>>>>                                     struct sk_buff *skb) { };
>>>> +static inline int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring,
>>>> +                                struct sk_buff *skb, __be16 protocol,
>>>> +                                struct ixgbe_ipsec_tx_data *itd) {
>>>> return 0; };
>>>>    static inline void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter) {
>>>> };
>>>>    #endif /* CONFIG_XFRM_OFFLOAD */
>>>>    #endif /* _IXGBE_H_ */
>>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>>> index fd06d9b..2a0dd7a 100644
>>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
>>>> @@ -703,12 +703,89 @@ static void ixgbe_ipsec_del_sa(struct xfrm_state
>>>> *xs)
>>>>           }
>>>>    }
>>>>
>>>> +/**
>>>> + * ixgbe_ipsec_offload_ok - can this packet use the xfrm hw offload
>>>> + * @skb: current data packet
>>>> + * @xs: pointer to transformer state struct
>>>> + **/
>>>> +static bool ixgbe_ipsec_offload_ok(struct sk_buff *skb, struct
>>>> xfrm_state *xs)
>>>> +{
>>>> +       if (xs->props.family == AF_INET) {
>>>> +               /* Offload with IPv4 options is not supported yet */
>>>> +               if (ip_hdr(skb)->ihl > 5)
>>>
>>>
>>> I would make this ihl != 5 instead of "> 5" since smaller values would
>>> be invalid as well.
>>
>>
>> Sure
>>
>>
>>>
>>>> +                       return false;
>>>> +       } else {
>>>> +               /* Offload with IPv6 extension headers is not support yet
>>>> */
>>>> +               if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
>>>> +                       return false;
>>>> +       }
>>>> +
>>>> +       return true;
>>>> +}
>>>> +
>>>>    static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
>>>>           .xdo_dev_state_add = ixgbe_ipsec_add_sa,
>>>>           .xdo_dev_state_delete = ixgbe_ipsec_del_sa,
>>>> +       .xdo_dev_offload_ok = ixgbe_ipsec_offload_ok,
>>>>    };
>>>>
>>>>    /**
>>>> + * ixgbe_ipsec_tx - setup Tx flags for ipsec offload
>>>> + * @tx_ring: outgoing context
>>>> + * @skb: current data packet
>>>> + * @protocol: network protocol
>>>> + * @itd: ipsec Tx data for later use in building context descriptor
>>>> + **/
>>>> +int ixgbe_ipsec_tx(struct ixgbe_ring *tx_ring, struct sk_buff *skb,
>>>> +                  __be16 protocol, struct ixgbe_ipsec_tx_data *itd)
>>>> +{
>>>> +       struct ixgbe_adapter *adapter = netdev_priv(tx_ring->netdev);
>>>> +       struct ixgbe_ipsec *ipsec = adapter->ipsec;
>>>> +       struct xfrm_state *xs;
>>>> +       struct tx_sa *tsa;
>>>> +
>>>> +       if (!skb->sp->len) {
>>>> +               netdev_err(tx_ring->netdev, "%s: no xfrm state len =
>>>> %d\n",
>>>> +                          __func__, skb->sp->len);
>>>> +               return 0;
>>>> +       }
>>>> +
>>>> +       xs = xfrm_input_state(skb);
>>>> +       if (!xs) {
>>>> +               netdev_err(tx_ring->netdev, "%s: no xfrm_input_state() xs
>>>> = %p\n",
>>>> +                          __func__, xs);
>>>> +               return 0;
>>>> +       }
>>>> +
>>>> +       itd->sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_TX_INDEX;
>>>> +       if (itd->sa_idx > IXGBE_IPSEC_MAX_SA_COUNT) {
>>>> +               netdev_err(tx_ring->netdev, "%s: bad sa_idx=%d
>>>> handle=%lu\n",
>>>> +                          __func__, itd->sa_idx,
>>>> xs->xso.offload_handle);
>>>> +               return 0;
>>>> +       }
>>>> +
>>>> +       tsa = &ipsec->tx_tbl[itd->sa_idx];
>>>> +       if (!tsa->used) {
>>>> +               netdev_err(tx_ring->netdev, "%s: unused sa_idx=%d\n",
>>>> +                          __func__, itd->sa_idx);
>>>> +               return 0;
>>>> +       }
>>>> +
>>>> +       itd->flags = 0;
>>>> +       if (xs->id.proto == IPPROTO_ESP) {
>>>> +               itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_TYPE_ESP |
>>>> +                             IXGBE_ADVTXD_TUCMD_L4T_TCP;
>>>
>>>
>>> Why is the TCP value being set here? This doesn't seem correct either.
>>> This implies TCP a TCP offload. It seems like this should only be
>>> setting ESP.
>>
>>
>> Honestly?  Because when I was testing that, it didn't work without it. This
>> was one of the things I was going to come back to when I started working on
>> the csum and tso support.
> 
> We might want to try testing with that dropped to see if we need it or
> not. I would suspect not since I would imagine this would cause bad
> things for non-TCP traffic. Also the inner L4 header shouldn't matter
> unless you are trying to offload it.
> 
>>>
>>>> +               if (protocol == htons(ETH_P_IP))
>>>> +                       itd->flags |= IXGBE_ADVTXD_TUCMD_IPV4;
>>>
>>>
>>> Does the IPsec offload need to know if the frame is v4 or v6? I'm just
>>> wondering if it does or not.
>>
>>
>> Yes, I believe this is how it knows how much header to skip to find the ESP
>> header.  However, I'll test that and see if it can come out.
> 
> Like I mentioned last time it might be better to have this handled in
> ixgbe_tx_csum. If it is harmless we can probably just include it
> there. We should be able to do it in the block after the no_csum
> label. I'd be curious if not doing this up until now might have other
> effects such as impacting RSS since I know the whole reason for us
> having to do the CC stuff anyway was to actually get header split to
> work correctly with PF/VF loopback packets. It wouldn't surprise me if
> setting these fields defines the packet type received on the other
> end.
> 
>>> If not then this probably isn't needed.
>>> One thought on this line is you might look at moving it into
>>> ixgbe_tx_csum. If setting the bit is harmless without setting IXSM we
>>> might look at moving it into the end of ixgbe_tx_csum and just make it
>>> compare against first->protocol there.
>>
>>
>>>
>>>> +               itd->trailer_len = xs->props.trailer_len;
>>>> +       }
>>>> +       if (tsa->encrypt)
>>>> +               itd->flags |= IXGBE_ADVTXD_TUCMD_IPSEC_ENCRYPT_EN;
>>>> +
>>>> +       return 1;
>>>> +}
>>>> +
>>>> +/**
>>>>     * ixgbe_ipsec_rx - decode ipsec bits from Rx descriptor
>>>>     * @rx_ring: receiving ring
>>>>     * @rx_desc: receive data descriptor
>>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>>> index f1bfae0..d7875b3 100644
>>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>>> @@ -1261,7 +1261,7 @@ void ixgbe_clear_interrupt_scheme(struct
>>>> ixgbe_adapter *adapter)
>>>>    }
>>>>
>>>>    void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring, u32 vlan_macip_lens,
>>>> -                      u32 fcoe_sof_eof, u32 type_tucmd, u32
>>>> mss_l4len_idx)
>>>> +                      u32 fceof_saidx, u32 type_tucmd, u32
>>>> mss_l4len_idx)
>>>>    {
>>>>           struct ixgbe_adv_tx_context_desc *context_desc;
>>>>           u16 i = tx_ring->next_to_use;
>>>> @@ -1275,7 +1275,7 @@ void ixgbe_tx_ctxtdesc(struct ixgbe_ring *tx_ring,
>>>> u32 vlan_macip_lens,
>>>>           type_tucmd |= IXGBE_TXD_CMD_DEXT | IXGBE_ADVTXD_DTYP_CTXT;
>>>>
>>>>           context_desc->vlan_macip_lens   = cpu_to_le32(vlan_macip_lens);
>>>> -       context_desc->seqnum_seed       = cpu_to_le32(fcoe_sof_eof);
>>>> +       context_desc->fceof_saidx       = cpu_to_le32(fceof_saidx);
>>>>           context_desc->type_tucmd_mlhl   = cpu_to_le32(type_tucmd);
>>>>           context_desc->mss_l4len_idx     = cpu_to_le32(mss_l4len_idx);
>>>>    }
>>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>>> index 60f9f2d..c857594 100644
>>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>>> @@ -7659,9 +7659,10 @@ static void ixgbe_service_task(struct work_struct
>>>> *work)
>>>>
>>>>    static int ixgbe_tso(struct ixgbe_ring *tx_ring,
>>>>                        struct ixgbe_tx_buffer *first,
>>>> -                    u8 *hdr_len)
>>>> +                    u8 *hdr_len,
>>>> +                    struct ixgbe_ipsec_tx_data *itd)
>>>>    {
>>>> -       u32 vlan_macip_lens, type_tucmd, mss_l4len_idx;
>>>> +       u32 vlan_macip_lens, type_tucmd, mss_l4len_idx, fceof_saidx = 0;
>>>>           struct sk_buff *skb = first->skb;
>>>>           union {
>>>>                   struct iphdr *v4;
>>>> @@ -7740,7 +7741,12 @@ static int ixgbe_tso(struct ixgbe_ring *tx_ring,
>>>>           vlan_macip_lens |= (ip.hdr - skb->data) <<
>>>> IXGBE_ADVTXD_MACLEN_SHIFT;
>>>>           vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
>>>>
>>>> -       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd,
>>>> +       if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
>>>> +               fceof_saidx |= itd->sa_idx;
>>>> +               type_tucmd |= itd->flags | itd->trailer_len;
>>>> +       }
> 
> So just a thought. Why bother with the TX_FLAGS_CHECK at all? It seems
> like in the case that the flag isn't set you would have itd->sa_idx
> equal to 0 anyway so it would still be the same result wouldn't it? It
> would save you from having to zero both fceof_saidx and itd->sa_idx
> since you could just pass itd->sa_idx and save yourself the extra
> variable.
> 
> Also if flags and trailer_len are both being written to the same
> location why not combine them in your structure into one single 32 bit
> entry? It would allow you to essentially reduce this to one OR and you
> could just pass itd->sa_idx directly which should be a pretty
> significant savings in terms of instructions and cycles. Also you
> might want to consider bumping itd->sa_idx up to a 32b value. It will
> possibly cost you a cycle or so to convert the 16b value to a 32b
> value before writing it. If you merge the flags and trailer length you
> should have the space to spare to bump up the size.
> 
>>>> +
>>>> +       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx,
>>>> type_tucmd,
>>>>                             mss_l4len_idx);
>>>>
>>>>           return 1;
>>>> @@ -7756,10 +7762,12 @@ static inline bool ixgbe_ipv6_csum_is_sctp(struct
>>>> sk_buff *skb)
>>>>    }
>>>>
>>>>    static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
>>>> -                         struct ixgbe_tx_buffer *first)
>>>> +                         struct ixgbe_tx_buffer *first,
>>>> +                         struct ixgbe_ipsec_tx_data *itd)
>>>>    {
>>>>           struct sk_buff *skb = first->skb;
>>>>           u32 vlan_macip_lens = 0;
>>>> +       u32 fceof_saidx = 0;
>>>>           u32 type_tucmd = 0;
>>>>
>>>>           if (skb->ip_summed != CHECKSUM_PARTIAL) {
>>>> @@ -7800,7 +7808,12 @@ static void ixgbe_tx_csum(struct ixgbe_ring
>>>> *tx_ring,
>>>>           vlan_macip_lens |= skb_network_offset(skb) <<
>>>> IXGBE_ADVTXD_MACLEN_SHIFT;
>>>>           vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
>>>>
>>>> -       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0, type_tucmd, 0);
>>>> +       if (first->tx_flags & IXGBE_TX_FLAGS_IPSEC) {
>>>> +               fceof_saidx |= itd->sa_idx;
>>>> +               type_tucmd |= itd->flags | itd->trailer_len;
>>>> +       }
>>>> +
>>>> +       ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, fceof_saidx,
>>>> type_tucmd, 0);
>>>>    }
>>>>
>>>>    #define IXGBE_SET_FLAG(_input, _flag, _result) \
>>>> @@ -7843,11 +7856,16 @@ static void ixgbe_tx_olinfo_status(union
>>>> ixgbe_adv_tx_desc *tx_desc,
>>>>                                           IXGBE_TX_FLAGS_CSUM,
>>>>                                           IXGBE_ADVTXD_POPTS_TXSM);
>>>>
>>>> -       /* enble IPv4 checksum for TSO */
>>>> +       /* enable IPv4 checksum for TSO */
>>>>           olinfo_status |= IXGBE_SET_FLAG(tx_flags,
>>>>                                           IXGBE_TX_FLAGS_IPV4,
>>>>                                           IXGBE_ADVTXD_POPTS_IXSM);
>>>>
>>>> +       /* enable IPsec */
>>>> +       olinfo_status |= IXGBE_SET_FLAG(tx_flags,
>>>> +                                       IXGBE_TX_FLAGS_IPSEC,
>>>> +                                       IXGBE_ADVTXD_POPTS_IPSEC);
>>>> +
>>>>           /*
>>>>            * Check Context must be set if Tx switch is enabled, which it
>>>>            * always is for case where virtual functions are running
>>>> @@ -8306,6 +8324,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff
>>>> *skb,
>>>>           u32 tx_flags = 0;
>>>>           unsigned short f;
>>>>           u16 count = TXD_USE_COUNT(skb_headlen(skb));
>>>> +       struct ixgbe_ipsec_tx_data ipsec_tx = { 0 };
>>>>           __be16 protocol = skb->protocol;
>>>>           u8 hdr_len = 0;
>>>>
>>>> @@ -8394,6 +8413,9 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff
>>>> *skb,
>>>>                   }
>>>>           }
>>>>
>>>> +       if (skb->sp && ixgbe_ipsec_tx(tx_ring, skb, protocol, &ipsec_tx))
>>>> +               tx_flags |= IXGBE_TX_FLAGS_IPSEC | IXGBE_TX_FLAGS_CC;
>>>
>>>
>>> You might just want to pull the skb->sp check into ixgbe_ipsec_tx and
>>> could pass tx_flags as a part of the first buffer. It doesn't really
>>> matter anyway as most of this will just be inlined so it will all end
>>> up a part of the same function anyway.
>>
>>
>> Since the function is defined in a different .o file, are you sure it will
>> get inlined?  I put the skb->sp check here to make sure we don't do an
>> unnecessary jump.
> 
> You're right. I forgot you are defining this in a different file.
> 
> Still I would like to see this moved down though. Where it is at
> doesn't really flow with everything else since FCoE and this aren't
> likely to ever interact so I would rather us check for FCoE and then
> get into the IPsec logic.
> 
>>>
>>> Also I would move this down so that it is handled after the fields in
>>> the first buffer_info structure are set. Then this can ll just fall
>>> inline with the TSO block and get handled there.
>>>
>>>> +
>>>>           /* record initial flags and protocol */
>>>>           first->tx_flags = tx_flags;
>>>>           first->protocol = protocol;
>>>> @@ -8410,11 +8432,11 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff
>>>> *skb,
>>>>           }
>>>>
>>>>    #endif /* IXGBE_FCOE */
>>>
>>>
>>> So if you move the function down here it will help to avoid any other
>>> complication. In addition you could follow the same logic that we do
>>> for ixgbe_tso/fso so you could drop the frame instead of transmitting
>>> it if it is requesting a bad offload.
>>
>>
>> Sure
>>
>> sln
>>
>>
>>>
>>>> -       tso = ixgbe_tso(tx_ring, first, &hdr_len);
>>>> +       tso = ixgbe_tso(tx_ring, first, &hdr_len, &ipsec_tx);
>>>>           if (tso < 0)
>>>>                   goto out_drop;
>>>>           else if (!tso)
>>>> -               ixgbe_tx_csum(tx_ring, first);
>>>> +               ixgbe_tx_csum(tx_ring, first, &ipsec_tx);
>>>>
>>>>           /* add the ATR filter if ATR is on */
>>>>           if (test_bit(__IXGBE_TX_FDIR_INIT_DONE, &tx_ring->state))
>>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>>>> index 3df0763..0ac725fa 100644
>>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
>>>> @@ -2856,7 +2856,7 @@ union ixgbe_adv_rx_desc {
>>>>    /* Context descriptors */
>>>>    struct ixgbe_adv_tx_context_desc {
>>>>           __le32 vlan_macip_lens;
>>>> -       __le32 seqnum_seed;
>>>> +       __le32 fceof_saidx;
>>>>           __le32 type_tucmd_mlhl;
>>>>           __le32 mss_l4len_idx;
>>>>    };
>>>> --
>>>> 2.7.4
>>>>
>>>> _______________________________________________
>>>> Intel-wired-lan mailing list
>>>> Intel-wired-lan at osuosl.org
>>>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 06/10] ixgbe: restore offloaded SAs after a reset
  2017-12-07 18:47           ` Shannon Nelson
@ 2017-12-07 21:52             ` Alexander Duyck
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-07 21:52 UTC (permalink / raw)
  To: Shannon Nelson
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On Thu, Dec 7, 2017 at 10:47 AM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> On 12/7/2017 9:16 AM, Alexander Duyck wrote:
>>
>> On Wed, Dec 6, 2017 at 9:43 PM, Shannon Nelson
>> <shannon.nelson@oracle.com> wrote:
>>>
>>> On 12/5/2017 9:30 AM, Alexander Duyck wrote:
>>>>
>>>>
>>>> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
>>>> <shannon.nelson@oracle.com> wrote:
>>>>>
>>>>>
>>>>> On a chip reset most of the table contents are lost, so must be
>
>
> <snip>
>
>>>>> +       /* reload the IP addrs */
>>>>> +       for (i = 0; i < IXGBE_IPSEC_MAX_RX_IP_COUNT; i++) {
>>>>> +               struct rx_ip_sa *ipsa = &ipsec->ip_tbl[i];
>>>>> +
>>>>> +               if (ipsa->used)
>>>>> +                       ixgbe_ipsec_set_rx_ip(hw, i, ipsa->ipaddr);
>>>>> +               else
>>>>> +                       ixgbe_ipsec_set_rx_ip(hw, i, zbuf);
>>>>
>>>>
>>>>
>>>> If we are doing a restore do we actually need to write the zero
>>>> values? If we did a reset I thought you had a function that was going
>>>> through and zeroing everything out. If so this now becomes redundant.
>>>
>>>
>>>
>>> Currently ixgbe_ipsec_clear_hw_tables() only gets run at at probe.  It
>>> should probably get run at remove as well.  Doing this is a bit of safety
>>> paranoia, and making sure the CAM memory structures that don't get
>>> cleared
>>> on reset have exactly what I expect in them.
>>
>>
>> You might just move ixgbe_ipsec_clear_hw_tables into the rest logic
>> itself. Then it covers all cases where you would be resetting the
>> hardware and expecting a consistent state. It will mean writing some
>> registers twice during the reset but it is probably better just to
>> make certain everything stays in a known good state after a reset.
>
>
> If it is a small number, e.g. 10 or 20, then you may be right.  However,
> given we have table space for 2k different SAs, at 6 writes per Tx SA and 10
> writes per Rx SA, plus 128 IP address with 4 writes each, we are already
> looking at 17K writes already to be sure the tables are clean.
>
> Unfortunately, I don't really know what a "typical" case will be, so I don't
> know how many SA we may be offloading at any one time.  But in a busy cloud
> support server, we might have nearly full tables.  If we do the full clean
> first, then have to fill all the tables, we're now looking at up to 35k
> writes slowing down the reset process.
>
> I'd rather keep it to the constant 17K writes for now, and look later at
> using the VALID bit in the IPSRXMOD to see if we can at least cut down on
> the Rx writes.
>
> sln

The reads/writes themselves should be cheap. These kind of things only
get to be really expensive when you start looking at adding delays in
between the writes/reads polling on things. As long as we aren't
waiting milliseconds on things you can write/read thousands of
registers and not even notice it.

One thing you might look at doing in order to speed some of this up a
bit would be to also combine updating the Tx SA and Rx SA in your
clear_hw_tables loop so that you could do them in parallel in your
loop instead of having to do them in series. Anyway it is just a
thought. If nothing else you might look at timing the function to see
how long it actually takes. I suspect it shouldn't be too long since
the turnaround time on the PCIe bus should be in microseconds so odds
are reading/writing 35K registers might ovinly add a few milliseconds
to total reset time.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 06/10] ixgbe: restore offloaded SAs after a reset
@ 2017-12-07 21:52             ` Alexander Duyck
  0 siblings, 0 replies; 78+ messages in thread
From: Alexander Duyck @ 2017-12-07 21:52 UTC (permalink / raw)
  To: intel-wired-lan

On Thu, Dec 7, 2017 at 10:47 AM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> On 12/7/2017 9:16 AM, Alexander Duyck wrote:
>>
>> On Wed, Dec 6, 2017 at 9:43 PM, Shannon Nelson
>> <shannon.nelson@oracle.com> wrote:
>>>
>>> On 12/5/2017 9:30 AM, Alexander Duyck wrote:
>>>>
>>>>
>>>> On Mon, Dec 4, 2017 at 9:35 PM, Shannon Nelson
>>>> <shannon.nelson@oracle.com> wrote:
>>>>>
>>>>>
>>>>> On a chip reset most of the table contents are lost, so must be
>
>
> <snip>
>
>>>>> +       /* reload the IP addrs */
>>>>> +       for (i = 0; i < IXGBE_IPSEC_MAX_RX_IP_COUNT; i++) {
>>>>> +               struct rx_ip_sa *ipsa = &ipsec->ip_tbl[i];
>>>>> +
>>>>> +               if (ipsa->used)
>>>>> +                       ixgbe_ipsec_set_rx_ip(hw, i, ipsa->ipaddr);
>>>>> +               else
>>>>> +                       ixgbe_ipsec_set_rx_ip(hw, i, zbuf);
>>>>
>>>>
>>>>
>>>> If we are doing a restore do we actually need to write the zero
>>>> values? If we did a reset I thought you had a function that was going
>>>> through and zeroing everything out. If so this now becomes redundant.
>>>
>>>
>>>
>>> Currently ixgbe_ipsec_clear_hw_tables() only gets run at at probe.  It
>>> should probably get run at remove as well.  Doing this is a bit of safety
>>> paranoia, and making sure the CAM memory structures that don't get
>>> cleared
>>> on reset have exactly what I expect in them.
>>
>>
>> You might just move ixgbe_ipsec_clear_hw_tables into the rest logic
>> itself. Then it covers all cases where you would be resetting the
>> hardware and expecting a consistent state. It will mean writing some
>> registers twice during the reset but it is probably better just to
>> make certain everything stays in a known good state after a reset.
>
>
> If it is a small number, e.g. 10 or 20, then you may be right.  However,
> given we have table space for 2k different SAs, at 6 writes per Tx SA and 10
> writes per Rx SA, plus 128 IP address with 4 writes each, we are already
> looking at 17K writes already to be sure the tables are clean.
>
> Unfortunately, I don't really know what a "typical" case will be, so I don't
> know how many SA we may be offloading at any one time.  But in a busy cloud
> support server, we might have nearly full tables.  If we do the full clean
> first, then have to fill all the tables, we're now looking at up to 35k
> writes slowing down the reset process.
>
> I'd rather keep it to the constant 17K writes for now, and look later at
> using the VALID bit in the IPSRXMOD to see if we can at least cut down on
> the Rx writes.
>
> sln

The reads/writes themselves should be cheap. These kind of things only
get to be really expensive when you start looking at adding delays in
between the writes/reads polling on things. As long as we aren't
waiting milliseconds on things you can write/read thousands of
registers and not even notice it.

One thing you might look at doing in order to speed some of this up a
bit would be to also combine updating the Tx SA and Rx SA in your
clear_hw_tables loop so that you could do them in parallel in your
loop instead of having to do them in series. Anyway it is just a
thought. If nothing else you might look at timing the function to see
how long it actually takes. I suspect it shouldn't be too long since
the turnaround time on the PCIe bus should be in microseconds so odds
are reading/writing 35K registers might ovinly add a few milliseconds
to total reset time.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [Intel-wired-lan] [next-queue 06/10] ixgbe: restore offloaded SAs after a reset
  2017-12-07 21:52             ` Alexander Duyck
@ 2017-12-07 22:19               ` Shannon Nelson
  -1 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07 22:19 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert,
	Sowmini Varadhan, Netdev

On 12/7/2017 1:52 PM, Alexander Duyck wrote:
> 
> The reads/writes themselves should be cheap. These kind of things only
> get to be really expensive when you start looking at adding delays in
> between the writes/reads polling on things. As long as we aren't
> waiting milliseconds on things you can write/read thousands of
> registers and not even notice it.
> 
> One thing you might look at doing in order to speed some of this up a
> bit would be to also combine updating the Tx SA and Rx SA in your
> clear_hw_tables loop so that you could do them in parallel in your
> loop instead of having to do them in series. Anyway it is just a
> thought. If nothing else you might look at timing the function to see
> how long it actually takes. I suspect it shouldn't be too long since
> the turnaround time on the PCIe bus should be in microseconds so odds
> are reading/writing 35K registers might ovinly add a few milliseconds
> to total reset time.
> 

Good ideas - thanks,
sln

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [Intel-wired-lan] [next-queue 06/10] ixgbe: restore offloaded SAs after a reset
@ 2017-12-07 22:19               ` Shannon Nelson
  0 siblings, 0 replies; 78+ messages in thread
From: Shannon Nelson @ 2017-12-07 22:19 UTC (permalink / raw)
  To: intel-wired-lan

On 12/7/2017 1:52 PM, Alexander Duyck wrote:
> 
> The reads/writes themselves should be cheap. These kind of things only
> get to be really expensive when you start looking at adding delays in
> between the writes/reads polling on things. As long as we aren't
> waiting milliseconds on things you can write/read thousands of
> registers and not even notice it.
> 
> One thing you might look at doing in order to speed some of this up a
> bit would be to also combine updating the Tx SA and Rx SA in your
> clear_hw_tables loop so that you could do them in parallel in your
> loop instead of having to do them in series. Anyway it is just a
> thought. If nothing else you might look at timing the function to see
> how long it actually takes. I suspect it shouldn't be too long since
> the turnaround time on the PCIe bus should be in microseconds so odds
> are reading/writing 35K registers might ovinly add a few milliseconds
> to total reset time.
> 

Good ideas - thanks,
sln

^ permalink raw reply	[flat|nested] 78+ messages in thread

end of thread, other threads:[~2017-12-07 22:20 UTC | newest]

Thread overview: 78+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-05  5:35 [next-queue 00/10] ixgbe: Add ipsec offload Shannon Nelson
2017-12-05  5:35 ` [Intel-wired-lan] " Shannon Nelson
2017-12-05  5:35 ` [next-queue 01/10] ixgbe: clean up ipsec defines Shannon Nelson
2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
2017-12-05  5:35 ` [next-queue 02/10] ixgbe: add ipsec register access routines Shannon Nelson
2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
2017-12-05 16:24   ` Rustad, Mark D
2017-12-05 16:24     ` Rustad, Mark D
2017-12-05 16:56   ` Alexander Duyck
2017-12-05 16:56     ` Alexander Duyck
2017-12-07  5:43     ` Shannon Nelson
2017-12-07  5:43       ` Shannon Nelson
2017-12-07 16:02       ` Alexander Duyck
2017-12-07 16:02         ` Alexander Duyck
2017-12-07 17:03         ` Shannon Nelson
2017-12-07 17:03           ` Shannon Nelson
2017-12-05  5:35 ` [next-queue 03/10] ixgbe: add ipsec engine start and stop routines Shannon Nelson
2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
2017-12-05 16:22   ` Alexander Duyck
2017-12-05 16:22     ` Alexander Duyck
2017-12-07  5:43     ` Shannon Nelson
2017-12-07  5:43       ` Shannon Nelson
2017-12-05  5:35 ` [next-queue 04/10] ixgbe: add ipsec data structures Shannon Nelson
2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
2017-12-05 17:03   ` Alexander Duyck
2017-12-05 17:03     ` Alexander Duyck
2017-12-07  5:43     ` Shannon Nelson
2017-12-07  5:43       ` Shannon Nelson
2017-12-05  5:35 ` [next-queue 05/10] ixgbe: implement ipsec add and remove of offloaded SA Shannon Nelson
2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
2017-12-05 17:26   ` Alexander Duyck
2017-12-05 17:26     ` Alexander Duyck
2017-12-07  5:43     ` Shannon Nelson
2017-12-07  5:43       ` Shannon Nelson
2017-12-05  5:35 ` [next-queue 06/10] ixgbe: restore offloaded SAs after a reset Shannon Nelson
2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
2017-12-05 17:30   ` Alexander Duyck
2017-12-05 17:30     ` Alexander Duyck
2017-12-07  5:43     ` Shannon Nelson
2017-12-07  5:43       ` Shannon Nelson
2017-12-07 17:16       ` Alexander Duyck
2017-12-07 17:16         ` Alexander Duyck
2017-12-07 18:47         ` Shannon Nelson
2017-12-07 18:47           ` Shannon Nelson
2017-12-07 21:52           ` Alexander Duyck
2017-12-07 21:52             ` Alexander Duyck
2017-12-07 22:19             ` Shannon Nelson
2017-12-07 22:19               ` Shannon Nelson
2017-12-05  5:35 ` [next-queue 07/10] ixgbe: process the Rx ipsec offload Shannon Nelson
2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
2017-12-05 17:40   ` Alexander Duyck
2017-12-05 17:40     ` Alexander Duyck
2017-12-07  5:43     ` Shannon Nelson
2017-12-07  5:43       ` Shannon Nelson
2017-12-07 17:20       ` Alexander Duyck
2017-12-07 17:20         ` Alexander Duyck
2017-12-05  5:35 ` [next-queue 08/10] ixgbe: process the Tx " Shannon Nelson
2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
2017-12-05 18:13   ` Alexander Duyck
2017-12-05 18:13     ` Alexander Duyck
2017-12-07  5:43     ` Shannon Nelson
2017-12-07  5:43       ` Shannon Nelson
2017-12-07 17:56       ` Alexander Duyck
2017-12-07 17:56         ` Alexander Duyck
2017-12-07 18:50         ` Shannon Nelson
2017-12-07 18:50           ` Shannon Nelson
2017-12-05  5:35 ` [next-queue 09/10] ixgbe: ipsec offload stats Shannon Nelson
2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
2017-12-05 19:53   ` Alexander Duyck
2017-12-05 19:53     ` Alexander Duyck
2017-12-07  5:43     ` Shannon Nelson
2017-12-07  5:43       ` Shannon Nelson
2017-12-05  5:35 ` [next-queue 10/10] ixgbe: register ipsec offload with the xfrm subsystem Shannon Nelson
2017-12-05  5:35   ` [Intel-wired-lan] " Shannon Nelson
2017-12-05 20:11   ` Alexander Duyck
2017-12-05 20:11     ` Alexander Duyck
2017-12-07  5:43     ` Shannon Nelson
2017-12-07  5:43       ` Shannon Nelson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.