From: Jasvinder Singh
Subject: [PATCH v3 0/2] librte_net: add crc computation support
Date: Sun, 12 Mar 2017 21:33:31 +0000
Message-ID: <1489354413-137376-1-git-send-email-jasvinder.singh@intel.com>
In-Reply-To: <1488283701-186162-2-git-send-email-jasvinder.singh@intel.com>
References: <1488283701-186162-2-git-send-email-jasvinder.singh@intel.com>
To: dev@dpdk.org
Cc: declan.doherty@intel.com, pablo.de.lara.guarch@intel.com

In some applications, a CRC (Cyclic Redundancy Check) needs to be
computed or updated during packet processing operations. This patchset
adds software implementations of some common standard CRCs (32-bit
Ethernet CRC as per Ethernet/[ISO/IEC 8802-3] and 16-bit CCITT-CRC
[ITU-T X.25]). Two versions of each 32-bit and 16-bit CRC calculation
are proposed.

The first version provides fast and efficient CRC generation on IA
processors by using the carry-less multiplication instruction PCLMULQDQ
(i.e. SSE4.2 intrinsics). In this implementation, a parallelized folding
approach is used to first reduce an arbitrary-length buffer to a small
fixed-size buffer (16 bytes) with the help of precomputed constants. The
resulting single 16-byte chunk is further reduced by the Barrett
reduction method to generate the final CRC value. For more details on
the implementation, see reference [1].

The second version provides a fallback solution that supports CRC
generation without needing any specific support from the CPU (for
example, SSE4.2 intrinsics).
It is based on a generic Look-Up Table (LUT) algorithm that uses a
precomputed 256-element table, as explained in reference [2].

During initialisation, all the data structures required for CRC
computation are initialised. Also, the x86-specific CRC implementation
(if supported by the platform) or the scalar version is enabled.

The following APIs have been added:

(i)  rte_net_crc_set_alg()
(ii) rte_net_crc_calc()

The first API (i) allows the user to select the specific CRC
implementation at run time, while the second API (ii) is used for
computing the 16-bit and 32-bit CRC.

References:
[1] Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
    Instruction
    http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf
[2] A PAINLESS GUIDE TO CRC ERROR DETECTION ALGORITHMS
    http://www.ross.net/crc/download/crc_v3.txt

v3 changes:
- separate the x86 specific implementation into new file
- improve the unit test

v2 changes:
- fix build errors for target i686-native-linuxapp-gcc
- fix checkpatch warnings

Notes:
- Build is not successful with clang versions earlier than 3.7.0 due to
  missing intrinsics. Refer to the DPDK known issues section for more
  details.

Jasvinder Singh (2):
  librte_net: add crc compute APIs
  app/test: add unit test for CRC computation

 app/test/Makefile                  |   2 +
 app/test/test_crc.c                | 265 ++++++++++++++++++++++++++++
 lib/librte_net/Makefile            |   3 +
 lib/librte_net/rte_net_crc.c       | 205 ++++++++++++++++++++++
 lib/librte_net/rte_net_crc.h       | 104 +++++++++++
 lib/librte_net/rte_net_crc_sse.h   | 351 +++++++++++++++++++++++++++++++++++++
 lib/librte_net/rte_net_version.map |   8 +
 7 files changed, 938 insertions(+)
 create mode 100644 app/test/test_crc.c
 create mode 100644 lib/librte_net/rte_net_crc.c
 create mode 100644 lib/librte_net/rte_net_crc.h
 create mode 100644 lib/librte_net/rte_net_crc_sse.h

-- 
2.5.5
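
For readers unfamiliar with the scalar fallback, the byte-wise LUT
approach from reference [2] can be sketched roughly as below. This is
only an illustration and not the code from this patchset: all names
(crc_lut_init, crc_lut_update, sketch_crc_init, sketch_crc32_eth,
sketch_crc16_ccitt) are hypothetical, and the CRC parameters are
assumptions based on the commonly used definitions (reflected
polynomial 0xEDB88320 for the Ethernet CRC-32, reflected polynomial
0x8408 for the X.25 CRC-16, all-ones initial value and final inversion
in both cases).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed parameters: bit-reversed forms of 0x04C11DB7 and 0x1021. */
#define SKETCH_CRC32_ETH_POLY_REFLECTED   0xEDB88320u
#define SKETCH_CRC16_CCITT_POLY_REFLECTED 0x8408u

/* Hypothetical tables, filled once by sketch_crc_init(). */
static uint32_t crc32_eth_lut[256];
static uint32_t crc16_ccitt_lut[256];

/* Build the 256-entry table for a reflected (LSB-first) polynomial. */
static void
crc_lut_init(uint32_t *lut, uint32_t poly_reflected)
{
	uint32_t i, bit, crc;

	for (i = 0; i < 256; i++) {
		crc = i;
		/* Process the 8 bits of the index, one shift per bit. */
		for (bit = 0; bit < 8; bit++)
			crc = (crc & 1) ? (crc >> 1) ^ poly_reflected
					: (crc >> 1);
		lut[i] = crc;
	}
}

/* Consume the buffer one byte at a time, one table lookup per byte. */
static uint32_t
crc_lut_update(const uint32_t *lut, uint32_t crc,
	       const uint8_t *data, size_t len)
{
	while (len--)
		crc = lut[(crc ^ *data++) & 0xff] ^ (crc >> 8);
	return crc;
}

/* Must be called once before either CRC function below. */
static void
sketch_crc_init(void)
{
	crc_lut_init(crc32_eth_lut, SKETCH_CRC32_ETH_POLY_REFLECTED);
	crc_lut_init(crc16_ccitt_lut, SKETCH_CRC16_CCITT_POLY_REFLECTED);
}

/* 32-bit Ethernet CRC: initial value all ones, result inverted. */
static uint32_t
sketch_crc32_eth(const void *data, size_t len)
{
	return ~crc_lut_update(crc32_eth_lut, 0xffffffffu, data, len);
}

/* 16-bit CCITT (X.25) CRC: initial value 0xffff, result inverted. */
static uint16_t
sketch_crc16_ccitt(const void *data, size_t len)
{
	return (uint16_t)~crc_lut_update(crc16_ccitt_lut, 0xffffu,
					 data, len);
}
```

With these assumed parameters, calling sketch_crc_init() once and then
sketch_crc32_eth("123456789", 9) should give the standard check value
0xCBF43926, and sketch_crc16_ccitt("123456789", 9) should give 0x906E.
The same update loop serves both widths because the 16-bit state never
grows beyond the low 16 bits of the 32-bit accumulator.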