Date: Mon, 11 Jul 2022 11:53:06 +0200
From: Olivier Matz
To: Mattias Rönnblom
Cc: Emil Berg, bruce.richardson@intel.com, stephen@networkplumber.org,
 stable@dpdk.org, bugzilla@dpdk.org, dev@dpdk.org, onar.olsen@ericsson.com,
 Morten Brørup
Subject: Re: [PATCH v2 2/2] net: have checksum routines accept unaligned data
References: <6839721a-8050-0e11-0c66-0f735ec8c56d@ericsson.com>
 <20220708125608.24532-1-mattias.ronnblom@ericsson.com>
 <20220708125608.24532-2-mattias.ronnblom@ericsson.com>
In-Reply-To: <20220708125608.24532-2-mattias.ronnblom@ericsson.com>
List-Id: DPDK patches and discussions

Hi,

On Fri, Jul 08, 2022 at 02:56:08PM +0200, Mattias Rönnblom wrote:
> __rte_raw_cksum() (used by rte_raw_cksum() among others) accessed its
> data through an uint16_t pointer, which allowed the compiler to assume
> the data was 16-bit aligned.
> This in turn would, with certain
> architectures and compiler flag combinations, result in code with SIMD
> load or store instructions with restrictions on data alignment.
>
> This patch keeps the old algorithm, but data is read using memcpy()
> instead of direct pointer access, forcing the compiler to always
> generate code that handles unaligned input. The __may_alias__ GCC
> attribute is no longer needed.
>
> The data on which the Internet checksum functions operates are almost
> always 16-bit aligned, but there are exceptions. In particular, the
> PDCP protocol header may (literally) have an odd size.
>
> Performance impact seems to range from none to a very slight
> regression.
>
> Bugzilla ID: 1035
> Cc: stable@dpdk.org
>
> ---

Using memcpy() looks to be a good solution to fix the issue, while
avoiding a branch and the __may_alias__ attribute.

I just have one minor comment below.

>
> v2:
>   * Simplified the odd-length conditional (Morten Brørup).
>
> Reviewed-by: Morten Brørup
>
> Signed-off-by: Mattias Rönnblom
> ---
>  lib/net/rte_ip.h | 17 ++++++++++-------
>  1 file changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> index b502481670..a0334d931e 100644
> --- a/lib/net/rte_ip.h
> +++ b/lib/net/rte_ip.h
> @@ -160,18 +160,21 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr *ipv4_hdr)
>  static inline uint32_t
>  __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
>  {
> -	/* extend strict-aliasing rules */
> -	typedef uint16_t __attribute__((__may_alias__)) u16_p;
> -	const u16_p *u16_buf = (const u16_p *)buf;
> -	const u16_p *end = u16_buf + len / sizeof(*u16_buf);
> +	const void *end;
>
> -	for (; u16_buf != end; ++u16_buf)
> -		sum += *u16_buf;
> +	for (end = RTE_PTR_ADD(buf, (len/sizeof(uint16_t)) * sizeof(uint16_t));

What do you think about this form:

	for (end = RTE_PTR_ADD(buf, RTE_ALIGN_FLOOR(len, sizeof(uint16_t)));

This also has the nice property of settling the debate about the spaces
around the '/' :)

> +	     buf != end; buf = RTE_PTR_ADD(buf, sizeof(uint16_t))) {
> +		uint16_t v;
> +
> +		memcpy(&v, buf, sizeof(uint16_t));
> +		sum += v;
> +	}
>
>  	/* if length is odd, keeping it byte order independent */
>  	if (unlikely(len % 2)) {
>  		uint16_t left = 0;
> -		*(unsigned char *)&left = *(const unsigned char *)end;
> +
> +		memcpy(&left, end, 1);
>  		sum += left;
>  	}
>
> --
> 2.25.1
>
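
For reference, here is a minimal standalone sketch of the loop with the
suggested form applied. It does not use the DPDK headers:
SKETCH_ALIGN_FLOOR(), sketch_raw_cksum() and sketch_cksum_reduce() are
hypothetical names that only stand in for RTE_ALIGN_FLOOR(),
__rte_raw_cksum() and the usual 32-to-16-bit fold, and a byte pointer
replaces RTE_PTR_ADD(). It is an illustration of the idea under those
assumptions, not the patch itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for RTE_ALIGN_FLOOR(): round val down to a
 * multiple of align. Assumes align is a power of two.
 */
#define SKETCH_ALIGN_FLOOR(val, align) \
	((size_t)(val) & ~((size_t)(align) - 1))

/*
 * Add the 16-bit words found in buf to sum. Every word is read with
 * memcpy(), so the compiler cannot assume buf is 16-bit aligned.
 */
static inline uint32_t
sketch_raw_cksum(const void *buf, size_t len, uint32_t sum)
{
	const unsigned char *p = buf;
	const unsigned char *end = p + SKETCH_ALIGN_FLOOR(len, sizeof(uint16_t));

	for (; p != end; p += sizeof(uint16_t)) {
		uint16_t v;

		memcpy(&v, p, sizeof(v));
		sum += v;
	}

	/*
	 * Odd length: copy the trailing byte into the first byte of a
	 * zeroed 16-bit word, which keeps the result byte order independent.
	 */
	if (len % 2) {
		uint16_t left = 0;

		memcpy(&left, end, 1);
		sum += left;
	}

	return sum;
}

/* Fold the 32-bit running sum into 16 bits (end-around carry). */
static inline uint16_t
sketch_cksum_reduce(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

Calling sketch_raw_cksum() on a pointer at an odd address, for example one
byte into a char array, is the case the old uint16_t pointer access could
get wrong. With memcpy() it should give the same sum as for an aligned copy
of the same bytes.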