* [PATCH 0/4] Optimize memcpy for AVX512 platforms
@ 2016-01-14  6:13 Zhihong Wang
  2016-01-14  6:13 ` [PATCH 1/4] lib/librte_eal: Identify AVX512 CPU flag Zhihong Wang
                   ` (5 more replies)
  0 siblings, 6 replies; 23+ messages in thread
From: Zhihong Wang @ 2016-01-14  6:13 UTC
  To: dev

This patch set optimizes DPDK memcpy for AVX512 platforms to make full
use of hardware resources and deliver high performance.

In current DPDK, memcpy accounts for a large proportion of execution
time in libs like Vhost, especially for large packets, so this patch set
can bring considerable benefits.

The implementation is based on the current DPDK memcpy framework. Some
background can be found in these threads:
http://dpdk.org/ml/archives/dev/2014-November/008158.html
http://dpdk.org/ml/archives/dev/2015-January/011800.html

Code changes are:

  1. Read CPUID to check if AVX512 is supported by the CPU

  2. Predefine AVX512 macro if AVX512 is enabled by the compiler

  3. Implement AVX512 memcpy and choose the right implementation based on
     predefined macros

  4. Decide alignment unit for memcpy perf test based on predefined macros
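
For readers who want a concrete picture of the four steps above, here is
a minimal sketch. It is not the code from these patches. All names in it
(cpu_has_avx512f, COPY_ALIGN_UNIT, copy_block64, sketch_memcpy) are
hypothetical and chosen only for illustration. It assumes a GCC or Clang
toolchain, which defines __AVX512F__ when AVX512F code generation is
enabled (for example with -mavx512f). The runtime check only reads the
CPUID feature bit; a complete check would also confirm that the OS saves
the ZMM register state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Steps 2 and 4: pick the widest vector unit the compiler was told to
 * target and derive an alignment unit from it (hypothetical macro). */
#if defined(__AVX512F__)
#include <immintrin.h>
#define COPY_ALIGN_UNIT 64
#elif defined(__AVX2__)
#include <immintrin.h>
#define COPY_ALIGN_UNIT 32
#else
#define COPY_ALIGN_UNIT 16
#endif

/* Step 1: CPUID leaf 7, subleaf 0, EBX bit 16 reports AVX512F.
 * Returns 0 on non-x86 targets or when leaf 7 is unavailable. */
static int cpu_has_avx512f(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if (__get_cpuid_max(0, NULL) < 7)
        return 0;
    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    (void)eax; (void)ecx; (void)edx;
    return (ebx >> 16) & 1u;
#else
    return 0;
#endif
}

/* Step 3: copy one 64-byte block with the implementation selected at
 * compile time by the predefined macros. */
static inline void copy_block64(uint8_t *dst, const uint8_t *src)
{
#if defined(__AVX512F__)
    __m512i v = _mm512_loadu_si512((const void *)src);
    _mm512_storeu_si512((void *)dst, v);
#elif defined(__AVX2__)
    __m256i lo = _mm256_loadu_si256((const __m256i *)src);
    __m256i hi = _mm256_loadu_si256((const __m256i *)(src + 32));
    _mm256_storeu_si256((__m256i *)dst, lo);
    _mm256_storeu_si256((__m256i *)(dst + 32), hi);
#else
    memcpy(dst, src, 64);
#endif
}

/* Copy n bytes: whole 64-byte blocks first, then the remaining tail. */
static void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

    assert(n == 0 || (dst != NULL && src != NULL));
    while (n >= 64) {
        copy_block64(d, s);
        d += 64;
        s += 64;
        n -= 64;
    }
    if (n > 0)
        memcpy(d, s, n);
    return dst;
}
```

Selecting the implementation with predefined macros resolves the choice
at build time, so the copy path itself carries no runtime dispatch
cost. The trade-off is that a binary built with AVX512 enabled should
only run on CPUs that support it, which is where a CPUID check like
the one above is useful.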

Zhihong Wang (4):
  lib/librte_eal: Identify AVX512 CPU flag
  mk: Predefine AVX512 macro for compiler
  lib/librte_eal: Optimize memcpy for AVX512 platforms
  app/test: Adjust alignment unit for memcpy perf test

 app/test/test_memcpy_perf.c                        |   6 +
 .../common/include/arch/x86/rte_cpuflags.h         |   2 +
 .../common/include/arch/x86/rte_memcpy.h           | 247 ++++++++++++++++++++-
 mk/rte.cpuflags.mk                                 |   4 +
 4 files changed, 255 insertions(+), 4 deletions(-)

-- 
2.5.0


end of thread, other threads:[~2017-09-18  5:11 UTC | newest]

Thread overview: 23+ messages
2016-01-14  6:13 [PATCH 0/4] Optimize memcpy for AVX512 platforms Zhihong Wang
2016-01-14  6:13 ` [PATCH 1/4] lib/librte_eal: Identify AVX512 CPU flag Zhihong Wang
2016-01-14  6:13 ` [PATCH 2/4] mk: Predefine AVX512 macro for compiler Zhihong Wang
2016-01-14  6:13 ` [PATCH 3/4] lib/librte_eal: Optimize memcpy for AVX512 platforms Zhihong Wang
2016-01-14  6:13 ` [PATCH 4/4] app/test: Adjust alignment unit for memcpy perf test Zhihong Wang
2016-01-14 16:48 ` [PATCH 0/4] Optimize memcpy for AVX512 platforms Stephen Hemminger
2016-01-15  6:39   ` Wang, Zhihong
2016-01-15 22:03     ` Vincent JARDIN
2016-01-18  3:05 ` [PATCH v2 0/5] " Zhihong Wang
2016-01-18  3:05   ` [PATCH v2 1/5] lib/librte_eal: Identify AVX512 CPU flag Zhihong Wang
2016-01-18  3:05   ` [PATCH v2 2/5] mk: Predefine AVX512 macro for compiler Zhihong Wang
2016-01-18  3:05   ` [PATCH v2 3/5] lib/librte_eal: Optimize memcpy for AVX512 platforms Zhihong Wang
2016-01-18  3:05   ` [PATCH v2 4/5] app/test: Adjust alignment unit for memcpy perf test Zhihong Wang
2016-01-18  3:05   ` [PATCH v2 5/5] lib/librte_eal: Tune memcpy for prior platforms Zhihong Wang
2016-01-18 20:06   ` [PATCH v2 0/5] Optimize memcpy for AVX512 platforms Stephen Hemminger
2016-01-19  2:37     ` Wang, Zhihong
2016-01-27 15:23   ` Thomas Monjalon
2016-01-28  6:09     ` Wang, Zhihong
2016-01-27 15:30   ` Thomas Monjalon
2016-01-27 18:48     ` Ananyev, Konstantin
2016-01-27 20:18       ` Thomas Monjalon
2017-08-30  9:37   ` linhaifeng
2017-09-18  5:10     ` Wang, Zhihong
