From: Emil Berg <emil.berg@ericsson.com>
To: "Mattias Rönnblom" <hofors@lysator.liu.se>,
	"Morten Brørup" <mb@smartsharesystems.com>,
	"bruce.richardson@intel.com" <bruce.richardson@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: "stephen@networkplumber.org" <stephen@networkplumber.org>,
	"stable@dpdk.org" <stable@dpdk.org>,
	"bugzilla@dpdk.org" <bugzilla@dpdk.org>,
	 "olivier.matz@6wind.com" <olivier.matz@6wind.com>
Subject: RE: [PATCH v4] net: fix checksum with unaligned buffer
Date: Mon, 27 Jun 2022 12:50:53 +0000
Message-ID: <AM8PR07MB76660E55460287468718E80598B99@AM8PR07MB7666.eurprd07.prod.outlook.com>
In-Reply-To: <AM8PR07MB76661BE2F6EBC456AA00333098B99@AM8PR07MB7666.eurprd07.prod.outlook.com>



> -----Original Message-----
> From: Emil Berg
> Sent: den 27 juni 2022 14:46
> To: Mattias Rönnblom <hofors@lysator.liu.se>; Morten Brørup
> <mb@smartsharesystems.com>; bruce.richardson@intel.com;
> dev@dpdk.org
> Cc: stephen@networkplumber.org; stable@dpdk.org; bugzilla@dpdk.org;
> olivier.matz@6wind.com
> Subject: RE: [PATCH v4] net: fix checksum with unaligned buffer
> 
> 
> 
> > -----Original Message-----
> > From: Mattias Rönnblom <hofors@lysator.liu.se>
> > Sent: den 27 juni 2022 14:28
> > To: Morten Brørup <mb@smartsharesystems.com>; Emil Berg
> > <emil.berg@ericsson.com>; bruce.richardson@intel.com; dev@dpdk.org
> > Cc: stephen@networkplumber.org; stable@dpdk.org; bugzilla@dpdk.org;
> > olivier.matz@6wind.com
> > Subject: Re: [PATCH v4] net: fix checksum with unaligned buffer
> >
> > On 2022-06-23 14:51, Morten Brørup wrote:
> > >> From: Morten Brørup [mailto:mb@smartsharesystems.com]
> > >> Sent: Thursday, 23 June 2022 14.39
> > >>
> > >> With this patch, the checksum can be calculated on an unaligned buffer.
> > >> I.e. the buf parameter is no longer required to be 16 bit aligned.
> > >>
> > >> The checksum is still calculated using a 16 bit aligned pointer, so
> > >> the compiler can auto-vectorize the function's inner loop.
> > >>
> > >> When the buffer is unaligned, the first byte of the buffer is
> > >> handled separately. Furthermore, the calculated checksum of the
> > >> buffer is byte shifted before being added to the initial checksum,
> > >> to compensate for the checksum having been calculated on the buffer
> > >> shifted by one byte.
> > >>
> > >> v4:
> > >> * Add copyright notice.
> > >> * Include stdbool.h (Emil Berg).
> > >> * Use RTE_PTR_ADD (Emil Berg).
> > >> * Fix one more typo in commit message. Is 'unligned' even a word?
> > >> v3:
> > >> * Remove braces from single statement block.
> > >> * Fix typo in commit message.
> > >> v2:
> > >> * Do not assume that the buffer is part of an aligned packet buffer.
> > >>
> > >> Bugzilla ID: 1035
> > >> Cc: stable@dpdk.org
> > >>
> > >> Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> > >> ---
> > >>   lib/net/rte_ip.h | 32 +++++++++++++++++++++++++++-----
> > >>   1 file changed, 27 insertions(+), 5 deletions(-)
> > >>
> > >> diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> > >> index b502481670..738d643da0 100644
> > >> --- a/lib/net/rte_ip.h
> > >> +++ b/lib/net/rte_ip.h
> > >> @@ -3,6 +3,7 @@
> > >>    *      The Regents of the University of California.
> > >>    * Copyright(c) 2010-2014 Intel Corporation.
> > >>    * Copyright(c) 2014 6WIND S.A.
> > >> + * Copyright(c) 2022 SmartShare Systems.
> > >>    * All rights reserved.
> > >>    */
> > >>
> > >> @@ -15,6 +16,7 @@
> > >>    * IP-related defines
> > >>    */
> > >>
> > >> +#include <stdbool.h>
> > >>   #include <stdint.h>
> > >>
> > >>   #ifdef RTE_EXEC_ENV_WINDOWS
> > >> @@ -162,20 +164,40 @@ __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
> > >>   {
> > >>   	/* extend strict-aliasing rules */
> > >>   	typedef uint16_t __attribute__((__may_alias__)) u16_p;
> > >> -	const u16_p *u16_buf = (const u16_p *)buf;
> > >> -	const u16_p *end = u16_buf + len / sizeof(*u16_buf);
> > >> +	const u16_p *u16_buf;
> > >> +	const u16_p *end;
> > >> +	uint32_t bsum = 0;
> > >> +	const bool unaligned = (uintptr_t)buf & 1;
> > >> +
> > >> +	/* if buffer is unaligned, keeping it byte order independent */
> > >> +	if (unlikely(unaligned)) {
> > >> +		uint16_t first = 0;
> > >> +		if (unlikely(len == 0))
> > >> +			return 0;
> > >> +		((unsigned char *)&first)[1] = *(const unsigned char *)buf;
> > >> +		bsum += first;
> > >> +		buf = RTE_PTR_ADD(buf, 1);
> > >> +		len--;
> > >> +	}
> > >>
> > >> +	/* aligned access for compiler auto-vectorization */
> >
> > The compiler will be able to auto vectorize even unaligned accesses,
> > just with different instructions. From what I can tell, there's no
> > performance impact, at least not on the x86_64 systems I tried on.
> >
> > I think you should remove the first special case conditional and use
> > memcpy() instead of the cumbersome __may_alias__ construct to retrieve
> > the data.
> >
> 
> Here:
> https://www.agner.org/optimize/instruction_tables.pdf
> it lists the latency of vmovdqa (aligned) as 6 cycles and the latency for
> vmovdqu (unaligned) as 7 cycles. So I guess there can be some difference.
> Although in practice I'm not sure what difference it makes. I've not seen any
> difference in runtime between the two versions.
> 

Correction to my comment:
Those figures are for an older CPU. For newer CPUs such as Tiger Lake,
the listed latencies appear to be the same for the aligned and
unaligned variants.
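
For reference, a minimal sketch of the memcpy()-based variant suggested
above, to make the comparison concrete. The name raw_cksum_memcpy is
made up for illustration and this is not the DPDK implementation; it
only assumes the same (buf, len, sum) contract as __rte_raw_cksum().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of DPDK: sums native-endian 16-bit
 * words of buf into sum without assuming any alignment of buf. */
static uint32_t
raw_cksum_memcpy(const void *buf, size_t len, uint32_t sum)
{
	const unsigned char *p = buf;
	const unsigned char *end;

	assert(buf != NULL || len == 0);
	if (len == 0)
		return sum;

	end = p + (len & ~(size_t)1);

	/* memcpy() of a fixed small size is a legal unaligned load */
	for (; p != end; p += sizeof(uint16_t)) {
		uint16_t v;

		memcpy(&v, p, sizeof(v));
		sum += v;
	}

	/* if length is odd, keep it byte order independent */
	if (len & 1) {
		uint16_t left = 0;

		memcpy(&left, end, 1);
		sum += left;
	}

	return sum;
}
```

Compilers typically turn a fixed-size memcpy() like this into a plain
load, so the sketch needs neither the __may_alias__ typedef nor the
special case for an odd start address. Whether it vectorizes as well as
the aligned loop would have to be measured.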

> > >> +	u16_buf = (const u16_p *)buf;
> > >> +	end = u16_buf + len / sizeof(*u16_buf);
> > >>   	for (; u16_buf != end; ++u16_buf)
> > >> -		sum += *u16_buf;
> > >> +		bsum += *u16_buf;
> > >>
> > >>   	/* if length is odd, keeping it byte order independent */
> > >>   	if (unlikely(len % 2)) {
> > >>   		uint16_t left = 0;
> > >>   		*(unsigned char *)&left = *(const unsigned char *)end;
> > >> -		sum += left;
> > >> +		bsum += left;
> > >>   	}
> > >>
> > >> -	return sum;
> > >> +	/* if buffer is unaligned, swap the checksum bytes */
> > >> +	if (unlikely(unaligned))
> > >> +		bsum = (bsum & 0xFF00FF00) >> 8 | (bsum & 0x00FF00FF) << 8;
> > >> +
> > >> +	return sum + bsum;
> > >>   }
> > >>
> > >>   /**
> > >> --
> > >> 2.17.1
> > >
> > > @Emil, thank you for thoroughly reviewing the previous versions.
> > >
> > > > If your test succeeds and you are satisfied with the patch,
> > > > remember to reply with a "Tested-by" tag for patchwork.
> > >
