From: Yongseok Koh
Subject: [PATCH v3 0/5] net/mlx5: add vectorized Rx/Tx burst for x86
Date: Wed, 5 Jul 2017 11:12:23 -0700
To: ferruh.yigit@intel.com
Cc: dev@dpdk.org, adrien.mazarguil@6wind.com, nelio.laranjeiro@6wind.com, Yongseok Koh
In-Reply-To: <20170628230403.10142-1-yskoh@mellanox.com>
References: <20170628230403.10142-1-yskoh@mellanox.com>
List-Id: DPDK patches and discussions

This series introduces more efficient Rx/Tx burst functions using SIMD
instructions. Currently they are supported only on 64-bit x86 with SSE4.1.

From a functional perspective, the Rx burst function is equivalent to the
existing mlx5_rx_burst() except for scatter support, which will be added soon.
The Tx burst function supports multi-segment packets and offload flags unless
they are disabled by txq_flags. Disabling those features, however, yields
slightly higher performance.

v3:
* Remove the SSE4.1 requirement, as DPDK now mandates at least SSE4.2 support.
* Bug fix in "net/mlx5: select Rx/Tx callbacks when starting device"
  - The Rx burst function must be re-selected when changing the MTU size.
* Resolve a gcc-6 optimization issue in rxq_burst_v()
  - Bit shift (<<) on a 128b vector type is compiled differently. 'psllq' is
    needed instead of 'sal'.
* Minor changes to address review comments.
  - Remove 'pragma' for PEDANTIC.
  - Make mlx5_ptype_table global.
  - Rename some inline functions that also exist in mlx4 under the same name.
  - Fix comments and indentation/spacing.

v2:
* Streamline redundant conditional clauses in txq_complete().
* Remove the mempool pointer in txq->mp2mr structure.
* Fix indentation and spacing.

Yongseok Koh (5):
  net/mlx5: change indexing for Tx SW ring
  net/mlx5: free buffers in bulk on Tx completion
  net/mlx5: use buffer address for LKEY search
  net/mlx5: select Rx/Tx callbacks when starting device
  net/mlx5: add vectorized Rx/Tx burst for SSE4.1

 drivers/net/mlx5/Makefile            |    3 +
 drivers/net/mlx5/mlx5_defs.h         |   18 +
 drivers/net/mlx5/mlx5_ethdev.c       |   47 +-
 drivers/net/mlx5/mlx5_mr.c           |   17 +-
 drivers/net/mlx5/mlx5_rxq.c          |   57 +-
 drivers/net/mlx5/mlx5_rxtx.c         |  459 ++++------
 drivers/net/mlx5/mlx5_rxtx.h         |  290 ++++++-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.c | 1378 ++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_trigger.c      |    3 +
 drivers/net/mlx5/mlx5_txq.c          |   23 +-
 10 files changed, 1927 insertions(+), 368 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_rxtx_vec_sse.c

-- 
2.11.0