DPDK-dev Archive on lore.kernel.org
 help / color / Atom feed
* [dpdk-dev] [PATCH 0/2] i40e neon vPMD optiomization for aarch64
@ 2019-08-13 10:43 Gavin Hu
  2019-08-13 10:43 ` [dpdk-dev] [PATCH 1/2] net/i40e: desc loading is unnecessarily ordered " Gavin Hu
  2019-08-13 10:43 ` [dpdk-dev] [PATCH 2/2] net/i40e: remove compiler barrier " Gavin Hu
  0 siblings, 2 replies; 3+ messages in thread
From: Gavin Hu @ 2019-08-13 10:43 UTC (permalink / raw)
  To: dev
  Cc: nd, thomas, jerinj, pbhagavatula, Honnappa.Nagarahalli,
	qi.z.zhang, bruce.richardson

Aarch64 neon vPMD survives across discontinuous DD bits, which makes
the ordering for descriptors loading unnecessary.
Similarly, the compiler barrier to order the extraction of packet
length is not needed any more when the extraction was simplified
by anothe patch.

Gavin Hu (2):
  net/i40e: desc loading is unnecessarily ordered for aarch64
  net/i40e: remove compiler barrier for aarch64

 drivers/net/i40e/i40e_rxtx_vec_neon.c | 5 -----
 1 file changed, 5 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [dpdk-dev] [PATCH 1/2] net/i40e: desc loading is unnecessarily ordered for aarch64
  2019-08-13 10:43 [dpdk-dev] [PATCH 0/2] i40e neon vPMD optiomization for aarch64 Gavin Hu
@ 2019-08-13 10:43 ` " Gavin Hu
  2019-08-13 10:43 ` [dpdk-dev] [PATCH 2/2] net/i40e: remove compiler barrier " Gavin Hu
  1 sibling, 0 replies; 3+ messages in thread
From: Gavin Hu @ 2019-08-13 10:43 UTC (permalink / raw)
  To: dev
  Cc: nd, thomas, jerinj, pbhagavatula, Honnappa.Nagarahalli,
	qi.z.zhang, bruce.richardson, stable

For x86, the descriptors needs to be loaded in order, so in between two
descriptors loading, there is a compiler barrier in place.[1]
For aarch64, a patch [2] is in place to survive with discontinuous DD bits,
the barriers can be removed to take full advantage of out-of-order
execution.

50% performance gain in the RFC2544 NDR test was measured on ThunderX2.
12.50% performan gain in the RFC2544 NDR test was measured on Ampere
eMAG80 platform.

[1] http://inbox.dpdk.org/users/039ED4275CED7440929022BC67E7061153D71548@
SHSMSX105.ccr.corp.intel.com/
[2] https://mails.dpdk.org/archives/stable/2017-October/003324.html

Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")
Cc: stable@dpdk.org

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
---
 drivers/net/i40e/i40e_rxtx_vec_neon.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c b/drivers/net/i40e/i40e_rxtx_vec_neon.c
index 83572ef..5555e9b 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
@@ -285,7 +285,6 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		/* Read desc statuses backwards to avoid race condition */
 		/* A.1 load 4 pkts desc */
 		descs[3] =  vld1q_u64((uint64_t *)(rxdp + 3));
-		rte_rmb();
 
 		/* B.2 copy 2 mbuf point into rx_pkts  */
 		vst1q_u64((uint64_t *)&rx_pkts[pos], mbp1);
-- 
2.7.4


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [dpdk-dev] [PATCH 2/2] net/i40e: remove compiler barrier for aarch64
  2019-08-13 10:43 [dpdk-dev] [PATCH 0/2] i40e neon vPMD optiomization for aarch64 Gavin Hu
  2019-08-13 10:43 ` [dpdk-dev] [PATCH 1/2] net/i40e: desc loading is unnecessarily ordered " Gavin Hu
@ 2019-08-13 10:43 ` " Gavin Hu
  1 sibling, 0 replies; 3+ messages in thread
From: Gavin Hu @ 2019-08-13 10:43 UTC (permalink / raw)
  To: dev
  Cc: nd, thomas, jerinj, pbhagavatula, Honnappa.Nagarahalli,
	qi.z.zhang, bruce.richardson, stable

As packet length extraction code was simplified,the ordering
was not necessary any more.[1]

2% performance gain was measured on Marvell ThunderX2.
4.3% performance gain was measure on Ampere eMAG80

[1] http://mails.dpdk.org/archives/dev/2016-April/037529.html

Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")
Cc: stable@dpdk.org

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
---
 drivers/net/i40e/i40e_rxtx_vec_neon.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c b/drivers/net/i40e/i40e_rxtx_vec_neon.c
index 5555e9b..864eb9a 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
@@ -307,9 +307,6 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 			rte_mbuf_prefetch_part2(rx_pkts[pos + 3]);
 		}
 
-		/* avoid compiler reorder optimization */
-		rte_compiler_barrier();
-
 		/* pkt 3,4 shift the pktlen field to be 16-bit aligned*/
 		uint32x4_t len3 = vshlq_u32(vreinterpretq_u32_u64(descs[3]),
 					    len_shl);
-- 
2.7.4


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, back to index

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-13 10:43 [dpdk-dev] [PATCH 0/2] i40e neon vPMD optiomization for aarch64 Gavin Hu
2019-08-13 10:43 ` [dpdk-dev] [PATCH 1/2] net/i40e: desc loading is unnecessarily ordered " Gavin Hu
2019-08-13 10:43 ` [dpdk-dev] [PATCH 2/2] net/i40e: remove compiler barrier " Gavin Hu

DPDK-dev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/dpdk-dev/0 dpdk-dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dpdk-dev dpdk-dev/ https://lore.kernel.org/dpdk-dev \
		dev@dpdk.org dpdk-dev@archiver.kernel.org
	public-inbox-index dpdk-dev


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/ public-inbox