All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Xie, WeiX" <weix.xie@intel.com>
To: "Yang, MurphyX" <murphyx.yang@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "Yang, Qiming" <qiming.yang@intel.com>,
	"Zhang, Qi Z" <qi.z.zhang@intel.com>,
	"Yang, SteveX" <stevex.yang@intel.com>,
	"Rong, Leyi" <leyi.rong@intel.com>,
	"Lu, Wenzhuo" <wenzhuo.lu@intel.com>,
	"Yang, MurphyX" <murphyx.yang@intel.com>
Subject: Re: [dpdk-dev] [PATCH v7] net/ice: fix outer checksum on cvl unknown
Date: Mon, 23 Nov 2020 08:56:08 +0000	[thread overview]
Message-ID: <e25993722f844d228c62fcb463ec3b2f@intel.com> (raw)
In-Reply-To: <20201123083455.87600-1-murphyx.yang@intel.com>

Tested-by:  Xie,WeiX < weix.xie@intel.com>

Regards,
Xie Wei


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Murphy Yang
> Sent: Monday, November 23, 2020 4:35 PM
> To: dev@dpdk.org
> Cc: Yang, Qiming <qiming.yang@intel.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>; Yang, SteveX <stevex.yang@intel.com>; Rong, Leyi
> <leyi.rong@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>; Yang,
> MurphyX <murphyx.yang@intel.com>
> Subject: [dpdk-dev] [PATCH v7] net/ice: fix outer checksum on cvl unknown
> 
> When received tunneled packets, the testpmd output log shows 'ol_flags'
> value always is 'PKT_RX_OUTER_L4_CKSUM_UNKNOWN', but expected
> value is 'PKT_RX_OUTER_L4_CKSUM_GOOD' or
> 'PKT_RX_OUTER_L4_CKSUM_BAD'.
> 
> Add the 'PKT_RX_OUTER_L4_CKSUM_GOOD' and
> 'PKT_RX_OUTER_L4_CKSUM_BAD' to 'flags' for normal path,
> 'l3_l4_flags_shuf' for AVX2 and AVX512 vector path and 'cksum_flags' for SSE
> vector path to ensure that the 'ol_flags'
> can match correct flags.
> 
> Fixes: dbf3c0e77a22 ("net/ice: handle Rx flex descriptor")
> Fixes: 4ab7dbb0a0f6 ("net/ice: switch to Rx flexible descriptor in AVX path")
> Fixes: ece1f8a8f1c8 ("net/ice: switch to flexible descriptor in SSE path")
> 
> Signed-off-by: Murphy Yang <murphyx.yang@intel.com>
> ---
> v7:
> - fix compile error with default target on SSE vector path.
> v6:
> - rename variable name.
> - update comments.
> v5:
> - fix outer L4 checksum mask for vector path.
> v4:
> - cover AVX512 vector path.
> v3:
> - add PKT_RX_OUTER_L4_CKSUM_GOOD in AVX2 and SSE vector path.
> - rename variable name.
> v2:
> - cover AVX2 and SSE vector path
>  drivers/net/ice/ice_rxtx.c            |   5 ++
>  drivers/net/ice/ice_rxtx_vec_avx2.c   | 118 +++++++++++++++++++-------
>  drivers/net/ice/ice_rxtx_vec_avx512.c | 117 ++++++++++++++++++-------
>  drivers/net/ice/ice_rxtx_vec_sse.c    |  78 ++++++++++++-----
>  4 files changed, 233 insertions(+), 85 deletions(-)
> 
> diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c index
> 5fbd68eafc..24a7caeb98 100644
> --- a/drivers/net/ice/ice_rxtx.c
> +++ b/drivers/net/ice/ice_rxtx.c
> @@ -1451,6 +1451,11 @@ ice_rxd_error_to_pkt_flags(uint16_t stat_err0)
>  	if (unlikely(stat_err0 & (1 <<
> ICE_RX_FLEX_DESC_STATUS0_XSUM_EIPE_S)))
>  		flags |= PKT_RX_EIP_CKSUM_BAD;
> 
> +	if (unlikely(stat_err0 & (1 <<
> ICE_RX_FLEX_DESC_STATUS0_XSUM_EUDPE_S)))
> +		flags |= PKT_RX_OUTER_L4_CKSUM_BAD;
> +	else
> +		flags |= PKT_RX_OUTER_L4_CKSUM_GOOD;
> +
>  	return flags;
>  }
> 
> diff --git a/drivers/net/ice/ice_rxtx_vec_avx2.c
> b/drivers/net/ice/ice_rxtx_vec_avx2.c
> index b72a9e7025..7838e17787 100644
> --- a/drivers/net/ice/ice_rxtx_vec_avx2.c
> +++ b/drivers/net/ice/ice_rxtx_vec_avx2.c
> @@ -251,43 +251,88 @@ _ice_recv_raw_pkts_vec_avx2(struct
> ice_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>  	 * bit13 is for VLAN indication.
>  	 */
>  	const __m256i flags_mask =
> -		 _mm256_set1_epi32((7 << 4) | (1 << 12) | (1 << 13));
> +		 _mm256_set1_epi32((0xF << 4) | (1 << 12) | (1 << 13));
>  	/**
>  	 * data to be shuffled by the result of the flags mask shifted by 4
>  	 * bits.  This gives use the l3_l4 flags.
>  	 */
> -	const __m256i l3_l4_flags_shuf = _mm256_set_epi8(0, 0, 0, 0, 0, 0, 0,
> 0,
> -			/* shift right 1 bit to make sure it not exceed 255 */
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_BAD |
> -			 PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_BAD |
> -			 PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_GOOD |
> -			 PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_GOOD |
> -			 PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			(PKT_RX_L4_CKSUM_BAD |
> PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_L4_CKSUM_BAD |
> PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			(PKT_RX_L4_CKSUM_GOOD |
> PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_L4_CKSUM_GOOD |
> PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			/* second 128-bits */
> -			0, 0, 0, 0, 0, 0, 0, 0,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_BAD |
> -			 PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_BAD |
> -			 PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_GOOD |
> -			 PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_GOOD |
> -			 PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			(PKT_RX_L4_CKSUM_BAD |
> PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_L4_CKSUM_BAD |
> PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			(PKT_RX_L4_CKSUM_GOOD |
> PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_L4_CKSUM_GOOD |
> PKT_RX_IP_CKSUM_GOOD) >> 1);
> +	const __m256i l3_l4_flags_shuf =
> +		_mm256_set_epi8((PKT_RX_OUTER_L4_CKSUM_BAD >> 20
> |
> +		 PKT_RX_EIP_CKSUM_BAD | PKT_RX_L4_CKSUM_BAD |
> +		  PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_GOOD) >>
> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_BAD  |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_BAD  |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_GOOD) >>
> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_BAD |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_BAD |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		/**
> +		 * second 128-bits
> +		 * shift right 20 bits to use the low two bits to indicate
> +		 * outer checksum status
> +		 * shift right 1 bit to make sure it not exceed 255
> +		 */
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_GOOD) >>
> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_BAD  |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_BAD  |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_GOOD) >>
> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_BAD |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_BAD |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1);
>  	const __m256i cksum_mask =
> -		 _mm256_set1_epi32(PKT_RX_IP_CKSUM_GOOD |
> PKT_RX_IP_CKSUM_BAD |
> -				   PKT_RX_L4_CKSUM_GOOD |
> PKT_RX_L4_CKSUM_BAD |
> -				   PKT_RX_EIP_CKSUM_BAD);
> +		 _mm256_set1_epi32(PKT_RX_IP_CKSUM_MASK |
> +				   PKT_RX_L4_CKSUM_MASK |
> +				   PKT_RX_EIP_CKSUM_BAD |
> +				   PKT_RX_OUTER_L4_CKSUM_MASK);
>  	/**
>  	 * data to be shuffled by result of flag mask, shifted down 12.
>  	 * If RSS(bit12)/VLAN(bit13) are set,
> @@ -469,6 +514,15 @@ _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue
> *rxq, struct rte_mbuf **rx_pkts,
>  		__m256i l3_l4_flags =
> _mm256_shuffle_epi8(l3_l4_flags_shuf,
>  				_mm256_srli_epi32(flag_bits, 4));
>  		l3_l4_flags = _mm256_slli_epi32(l3_l4_flags, 1);
> +
> +		__m256i l4_outer_mask = _mm256_set1_epi32(0x6);
> +		__m256i l4_outer_flags =
> +				_mm256_and_si256(l3_l4_flags,
> l4_outer_mask);
> +		l4_outer_flags = _mm256_slli_epi32(l4_outer_flags, 20);
> +
> +		__m256i l3_l4_mask = _mm256_set1_epi32(~0x6);
> +		l3_l4_flags = _mm256_and_si256(l3_l4_flags, l3_l4_mask);
> +		l3_l4_flags = _mm256_or_si256(l3_l4_flags, l4_outer_flags);
>  		l3_l4_flags = _mm256_and_si256(l3_l4_flags, cksum_mask);
>  		/* set rss and vlan flags */
>  		const __m256i rss_vlan_flag_bits =
> diff --git a/drivers/net/ice/ice_rxtx_vec_avx512.c
> b/drivers/net/ice/ice_rxtx_vec_avx512.c
> index df5d2be1e6..fd5d724329 100644
> --- a/drivers/net/ice/ice_rxtx_vec_avx512.c
> +++ b/drivers/net/ice/ice_rxtx_vec_avx512.c
> @@ -230,43 +230,88 @@ _ice_recv_raw_pkts_vec_avx512(struct
> ice_rx_queue *rxq,
>  	 * bit13 is for VLAN indication.
>  	 */
>  	const __m256i flags_mask =
> -		 _mm256_set1_epi32((7 << 4) | (1 << 12) | (1 << 13));
> +		 _mm256_set1_epi32((0xF << 4) | (1 << 12) | (1 << 13));
>  	/**
>  	 * data to be shuffled by the result of the flags mask shifted by 4
>  	 * bits.  This gives use the l3_l4 flags.
>  	 */
> -	const __m256i l3_l4_flags_shuf = _mm256_set_epi8(0, 0, 0, 0, 0, 0, 0,
> 0,
> -			/* shift right 1 bit to make sure it not exceed 255 */
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_BAD |
> -			 PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_BAD |
> -			 PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_GOOD |
> -			 PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_GOOD |
> -			 PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			(PKT_RX_L4_CKSUM_BAD |
> PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_L4_CKSUM_BAD |
> PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			(PKT_RX_L4_CKSUM_GOOD |
> PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_L4_CKSUM_GOOD |
> PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			/* 2nd 128-bits */
> -			0, 0, 0, 0, 0, 0, 0, 0,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_BAD |
> -			 PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_BAD |
> -			 PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_GOOD |
> -			 PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_GOOD |
> -			 PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			(PKT_RX_L4_CKSUM_BAD |
> PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_L4_CKSUM_BAD |
> PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			(PKT_RX_L4_CKSUM_GOOD |
> PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_L4_CKSUM_GOOD |
> PKT_RX_IP_CKSUM_GOOD) >> 1);
> +	const __m256i l3_l4_flags_shuf =
> +		_mm256_set_epi8((PKT_RX_OUTER_L4_CKSUM_BAD >> 20
> |
> +		 PKT_RX_EIP_CKSUM_BAD | PKT_RX_L4_CKSUM_BAD |
> +		  PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_GOOD) >>
> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_BAD  |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_BAD  |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_GOOD) >>
> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_BAD |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_BAD |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		/**
> +		 * second 128-bits
> +		 * shift right 20 bits to use the low two bits to indicate
> +		 * outer checksum status
> +		 * shift right 1 bit to make sure it not exceed 255
> +		 */
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_GOOD) >>
> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_BAD  |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_BAD  |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_GOOD) >>
> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_BAD |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_BAD |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1);
>  	const __m256i cksum_mask =
> -		 _mm256_set1_epi32(PKT_RX_IP_CKSUM_GOOD |
> PKT_RX_IP_CKSUM_BAD |
> -				   PKT_RX_L4_CKSUM_GOOD |
> PKT_RX_L4_CKSUM_BAD |
> -				   PKT_RX_EIP_CKSUM_BAD);
> +		 _mm256_set1_epi32(PKT_RX_IP_CKSUM_MASK |
> +				   PKT_RX_L4_CKSUM_MASK |
> +				   PKT_RX_EIP_CKSUM_BAD |
> +				   PKT_RX_OUTER_L4_CKSUM_MASK);
>  	/**
>  	 * data to be shuffled by result of flag mask, shifted down 12.
>  	 * If RSS(bit12)/VLAN(bit13) are set,
> @@ -451,6 +496,14 @@ _ice_recv_raw_pkts_vec_avx512(struct
> ice_rx_queue *rxq,
>  		__m256i l3_l4_flags =
> _mm256_shuffle_epi8(l3_l4_flags_shuf,
>  				_mm256_srli_epi32(flag_bits, 4));
>  		l3_l4_flags = _mm256_slli_epi32(l3_l4_flags, 1);
> +		__m256i l4_outer_mask = _mm256_set1_epi32(0x6);
> +		__m256i l4_outer_flags =
> +				_mm256_and_si256(l3_l4_flags,
> l4_outer_mask);
> +		l4_outer_flags = _mm256_slli_epi32(l4_outer_flags, 20);
> +
> +		__m256i l3_l4_mask = _mm256_set1_epi32(~0x6);
> +		l3_l4_flags = _mm256_and_si256(l3_l4_flags, l3_l4_mask);
> +		l3_l4_flags = _mm256_or_si256(l3_l4_flags, l4_outer_flags);
>  		l3_l4_flags = _mm256_and_si256(l3_l4_flags, cksum_mask);
>  		/* set rss and vlan flags */
>  		const __m256i rss_vlan_flag_bits =
> diff --git a/drivers/net/ice/ice_rxtx_vec_sse.c
> b/drivers/net/ice/ice_rxtx_vec_sse.c
> index 626364719b..87e0c3db2e 100644
> --- a/drivers/net/ice/ice_rxtx_vec_sse.c
> +++ b/drivers/net/ice/ice_rxtx_vec_sse.c
> @@ -114,39 +114,67 @@ ice_rx_desc_to_olflags_v(struct ice_rx_queue *rxq,
> __m128i descs[4],
>  	 * bit12 for RSS indication.
>  	 * bit13 for VLAN indication.
>  	 */
> -	const __m128i desc_mask = _mm_set_epi32(0x3070, 0x3070,
> -						0x3070, 0x3070);
> -
> +	const __m128i desc_mask = _mm_set_epi32(0x30f0, 0x30f0,
> +						0x30f0, 0x30f0);
>  	const __m128i cksum_mask =
> _mm_set_epi32(PKT_RX_IP_CKSUM_MASK |
>  						 PKT_RX_L4_CKSUM_MASK |
> +
> PKT_RX_OUTER_L4_CKSUM_MASK |
>  						 PKT_RX_EIP_CKSUM_BAD,
>  						 PKT_RX_IP_CKSUM_MASK |
>  						 PKT_RX_L4_CKSUM_MASK |
> +
> PKT_RX_OUTER_L4_CKSUM_MASK |
>  						 PKT_RX_EIP_CKSUM_BAD,
>  						 PKT_RX_IP_CKSUM_MASK |
>  						 PKT_RX_L4_CKSUM_MASK |
> +
> PKT_RX_OUTER_L4_CKSUM_MASK |
>  						 PKT_RX_EIP_CKSUM_BAD,
>  						 PKT_RX_IP_CKSUM_MASK |
>  						 PKT_RX_L4_CKSUM_MASK |
> +
> PKT_RX_OUTER_L4_CKSUM_MASK |
>  						 PKT_RX_EIP_CKSUM_BAD);
> 
>  	/* map the checksum, rss and vlan fields to the checksum, rss
>  	 * and vlan flag
>  	 */
> -	const __m128i cksum_flags = _mm_set_epi8(0, 0, 0, 0, 0, 0, 0, 0,
> -			/* shift right 1 bit to make sure it not exceed 255 */
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_BAD |
> -			 PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_BAD |
> -			 PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_GOOD |
> -			 PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_EIP_CKSUM_BAD |
> PKT_RX_L4_CKSUM_GOOD |
> -			 PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			(PKT_RX_L4_CKSUM_BAD |
> PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_L4_CKSUM_BAD |
> PKT_RX_IP_CKSUM_GOOD) >> 1,
> -			(PKT_RX_L4_CKSUM_GOOD |
> PKT_RX_IP_CKSUM_BAD) >> 1,
> -			(PKT_RX_L4_CKSUM_GOOD |
> PKT_RX_IP_CKSUM_GOOD) >> 1);
> +	const __m128i cksum_flags =
> +		_mm_set_epi8((PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> +		 PKT_RX_EIP_CKSUM_BAD | PKT_RX_L4_CKSUM_BAD |
> +		  PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_GOOD) >>
> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_BAD |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_BAD |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_BAD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		/**
> +		 * shift right 20 bits to use the low two bits to indicate
> +		 * outer checksum status
> +		 * shift right 1 bit to make sure it not exceed 255
> +		 */
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_BAD | PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_EIP_CKSUM_BAD |
> +		 PKT_RX_L4_CKSUM_GOOD | PKT_RX_IP_CKSUM_GOOD) >>
> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_BAD |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_BAD |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_BAD) >> 1,
> +		(PKT_RX_OUTER_L4_CKSUM_GOOD >> 20 |
> PKT_RX_L4_CKSUM_GOOD |
> +		 PKT_RX_IP_CKSUM_GOOD) >> 1);
> 
>  	const __m128i rss_vlan_flags = _mm_set_epi8(0, 0, 0, 0,
>  			0, 0, 0, 0,
> @@ -166,6 +194,14 @@ ice_rx_desc_to_olflags_v(struct ice_rx_queue *rxq,
> __m128i descs[4],
>  	flags = _mm_shuffle_epi8(cksum_flags, tmp_desc);
>  	/* then we shift left 1 bit */
>  	flags = _mm_slli_epi32(flags, 1);
> +
> +	__m128i l4_outer_mask = _mm_set_epi32(0x6, 0x6, 0x6, 0x6);
> +	__m128i l4_outer_flags = _mm_and_si128(flags, l4_outer_mask);
> +	l4_outer_flags = _mm_slli_epi32(l4_outer_flags, 20);
> +
> +	__m128i l3_l4_mask = _mm_set_epi32(~0x6, ~0x6, ~0x6, ~0x6);
> +	__m128i l3_l4_flags = _mm_and_si128(flags, l3_l4_mask);
> +	flags = _mm_or_si128(l3_l4_flags, l4_outer_flags);
>  	/* we need to mask out the reduntant bits introduced by RSS or
>  	 * VLAN fields.
>  	 */
> @@ -217,10 +253,10 @@ ice_rx_desc_to_olflags_v(struct ice_rx_queue *rxq,
> __m128i descs[4],
>  	 * appropriate flags means that we have to do a shift and blend for
>  	 * each mbuf before we do the write.
>  	 */
> -	rearm0 = _mm_blend_epi16(mbuf_init, _mm_slli_si128(flags, 8),
> 0x10);
> -	rearm1 = _mm_blend_epi16(mbuf_init, _mm_slli_si128(flags, 4),
> 0x10);
> -	rearm2 = _mm_blend_epi16(mbuf_init, flags, 0x10);
> -	rearm3 = _mm_blend_epi16(mbuf_init, _mm_srli_si128(flags, 4),
> 0x10);
> +	rearm0 = _mm_blend_epi16(mbuf_init, _mm_slli_si128(flags, 8),
> 0x30);
> +	rearm1 = _mm_blend_epi16(mbuf_init, _mm_slli_si128(flags, 4),
> 0x30);
> +	rearm2 = _mm_blend_epi16(mbuf_init, flags, 0x30);
> +	rearm3 = _mm_blend_epi16(mbuf_init, _mm_srli_si128(flags, 4),
> 0x30);
> 
>  	/* write the rearm data and the olflags in one write */
>  	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, ol_flags) !=
> --
> 2.17.1


  reply	other threads:[~2020-11-23  8:56 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-14  8:34 [dpdk-dev] [PATCH] net/ice: fix outher chksum on cvl unknown murphy yang
2020-10-16  5:07 ` [dpdk-dev] [PATCH 1/1] " murphy yang
2020-10-27  6:09   ` [dpdk-dev] [PATCH v3] " Murphy Yang
2020-11-03  6:43     ` [dpdk-dev] [PATCH v4] " Murphy Yang
2020-11-03  6:43       ` [dpdk-dev] [PATCH] " Murphy Yang
2020-11-04  9:26       ` [dpdk-dev] [PATCH v5] " Murphy Yang
2020-11-09  6:06         ` [dpdk-dev] [PATCH v6] net/ice: fix outer checksum " Murphy Yang
2020-11-13  3:16           ` Xie, WeiX
2020-11-13  5:33             ` Zhang, Qi Z
2020-11-13 11:51           ` Ferruh Yigit
2020-11-23  8:34           ` [dpdk-dev] [PATCH v7] " Murphy Yang
2020-11-23  8:56             ` Xie, WeiX [this message]
2020-12-15  8:10             ` [dpdk-dev] [PATCH v8] net/ice: fix outer checksum unknown Murphy Yang
2020-12-15 12:11               ` Zhang, Qi Z

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e25993722f844d228c62fcb463ec3b2f@intel.com \
    --to=weix.xie@intel.com \
    --cc=dev@dpdk.org \
    --cc=leyi.rong@intel.com \
    --cc=murphyx.yang@intel.com \
    --cc=qi.z.zhang@intel.com \
    --cc=qiming.yang@intel.com \
    --cc=stevex.yang@intel.com \
    --cc=wenzhuo.lu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.