From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zhihong Wang
Subject: [PATCH v2 0/5] Optimize memcpy for AVX512 platforms
Date: Sun, 17 Jan 2016 22:05:09 -0500
Message-ID: <1453086314-30158-1-git-send-email-zhihong.wang@intel.com>
References: <1452752002-107586-1-git-send-email-zhihong.wang@intel.com>
To: dev@dpdk.org
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 609505A62 for ; Mon, 18 Jan 2016 11:08:38 +0100 (CET)
In-Reply-To: <1452752002-107586-1-git-send-email-zhihong.wang@intel.com>
List-Id: patches and discussions about DPDK
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

This patch set optimizes DPDK memcpy for AVX512 platforms to make full
use of hardware resources and deliver high performance.

In current DPDK, memcpy accounts for a large proportion of execution
time in libs like Vhost, especially for large packets, so this patch
set can bring considerable benefits.

The implementation is based on the current DPDK memcpy framework. Some
background can be found in these threads:
http://dpdk.org/ml/archives/dev/2014-November/008158.html
http://dpdk.org/ml/archives/dev/2015-January/011800.html

Code changes are:

  1. Read CPUID to check if AVX512 is supported by CPU

  2. Predefine AVX512 macro if AVX512 is enabled by compiler

  3. Implement AVX512 memcpy and choose the right implementation based
     on predefined macros

  4. Decide alignment unit for memcpy perf test based on predefined
     macros

--------------
Changes in v2:

  1. Tune performance for prior platforms

Zhihong Wang (5):
  lib/librte_eal: Identify AVX512 CPU flag
  mk: Predefine AVX512 macro for compiler
  lib/librte_eal: Optimize memcpy for AVX512 platforms
  app/test: Adjust alignment unit for memcpy perf test
  lib/librte_eal: Tune memcpy for prior platforms

 app/test/test_memcpy_perf.c                        |   6 +
 .../common/include/arch/x86/rte_cpuflags.h         |   2 +
 .../common/include/arch/x86/rte_memcpy.h           | 269 ++++++++++++++++++++-
 mk/rte.cpuflags.mk                                 |   4 +
 4 files changed, 268 insertions(+), 13 deletions(-)

-- 
2.5.0
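
For readers without the patches at hand, the code changes listed in this cover letter (CPUID check, compile-time macro, choice of implementation and of alignment unit) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the series: the names ebx_has_avx512f, cpu_has_avx512f, COPY_ALIGN_UNIT and copy_impl_name are hypothetical, the compiler-predefined macros __AVX512F__ and __AVX2__ stand in for whatever macros the build system in the patches defines, and the alignment values 64/32/16 are assumed from the vector register widths in bytes. The CPUID part relies on the GCC/Clang <cpuid.h> helper __get_cpuid_count; a complete runtime check would also confirm through XGETBV that the OS saves the ZMM register state, which is left out here.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>              /* GCC/Clang helper for the CPUID instruction */
#endif

/* CPUID leaf 7, subleaf 0 reports AVX512F in EBX bit 16. */
#define AVX512F_EBX_BIT 16

/* Hypothetical helper: decode the AVX512F bit from a raw EBX value. */
static int ebx_has_avx512f(uint32_t ebx)
{
    return (int)((ebx >> AVX512F_EBX_BIT) & 1u);
}

/* Hypothetical helper: runtime query; returns 0 on non-x86 builds or
 * when CPUID leaf 7 is not available. */
static int cpu_has_avx512f(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return ebx_has_avx512f(ebx);
#else
    return 0;
#endif
}

/* Compile-time choice of alignment unit, keyed on compiler-predefined
 * macros (assumption: unit equals the vector register width in bytes). */
#if defined(__AVX512F__)
#define COPY_ALIGN_UNIT 64
#elif defined(__AVX2__)
#define COPY_ALIGN_UNIT 32
#else
#define COPY_ALIGN_UNIT 16
#endif

static_assert((COPY_ALIGN_UNIT & (COPY_ALIGN_UNIT - 1)) == 0,
              "alignment unit must be a power of two");

/* Hypothetical helper: name of the copy path this build would select. */
static const char *copy_impl_name(void)
{
#if defined(__AVX512F__)
    return "avx512";
#elif defined(__AVX2__)
    return "avx2";
#else
    return "sse";
#endif
}
```

Building with -mavx512f (or a -march value that implies it) makes the compiler define __AVX512F__ and so switches COPY_ALIGN_UNIT to 64, while cpu_has_avx512f() answers the separate question of whether the machine running the binary actually implements the instructions.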