linux-kernel.vger.kernel.org archive mirror
From: David Laight <David.Laight@ACULAB.COM>
To: 'Matteo Croce' <mcroce@linux.microsoft.com>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Nick Kossifidis <mick@ics.forth.gr>, Guo Ren <guoren@kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	"Emil Renner Berthing" <kernel@esmil.dk>,
	Drew Fustini <drew@beagleboard.org>,
	linux-arch <linux-arch@vger.kernel.org>,
	Nick Desaulniers <ndesaulniers@google.com>,
	linux-riscv <linux-riscv@lists.infradead.org>
Subject: RE: [PATCH v2 0/3] lib/string: optimized mem* functions
Date: Mon, 12 Jul 2021 08:15:41 +0000
Message-ID: <af19820cd24544cd8833d6db6d38154b@AcuMS.aculab.com>
In-Reply-To: <CAFnufp1d+FH1K5QAx+Z=KvMUvrveJAVnjJJc8xoDCn2wmzUOoQ@mail.gmail.com>

From: Matteo Croce
> Sent: 11 July 2021 00:08
> 
> On Sat, Jul 10, 2021 at 11:31 PM Andrew Morton
> <akpm@linux-foundation.org> wrote:
> >
> > On Fri,  2 Jul 2021 14:31:50 +0200 Matteo Croce <mcroce@linux.microsoft.com> wrote:
> >
> > > From: Matteo Croce <mcroce@microsoft.com>
> > >
> > > Rewrite the generic mem{cpy,move,set} so that memory is accessed with
> > > the widest size possible, but without doing unaligned accesses.
> > >
> > > This was originally posted as C string functions for RISC-V[1], but as
> > > there was no specific RISC-V code, it was proposed for the generic
> > > lib/string.c implementation.
> > >
> > > Tested on RISC-V and on x86_64 by undefining __HAVE_ARCH_MEM{CPY,SET,MOVE}
> > > and HAVE_EFFICIENT_UNALIGNED_ACCESS.
> > >
> > > These are the performances of memcpy() and memset() of a RISC-V machine
> > > on a 32 mbyte buffer:
> > >
> > > memcpy:
> > > original aligned:      75 Mb/s
> > > original unaligned:    75 Mb/s
> > > new aligned:          114 Mb/s
> > > new unaligned:        107 Mb/s
> > >
> > > memset:
> > > original aligned:     140 Mb/s
> > > original unaligned:   140 Mb/s
> > > new aligned:          241 Mb/s
> > > new unaligned:        241 Mb/s
> >
> > Did you record the x86_64 performance?
> >
> >
> > Which other architectures are affected by this change?
> 
> x86_64 won't use these functions because it defines __HAVE_ARCH_MEMCPY
> and has optimized implementations in arch/x86/lib.
> Anyway, I was curious and I tested them on x86_64 too, there was zero
> gain over the generic ones.

x86 performance (and attainable performance) does depend on the cpu
micro-architecture.

Any recent 'desktop' Intel cpu will almost certainly manage to
re-order the execution of almost any copy loop and attain 1 write per clock.
(Even the trivial 'while (count--) *dest++ = *src++;' loop.)

The same isn't true of the Atom-based cpus that may be found in small servers.
These are no slouches (e.g. 4 cores at 2.4GHz) but have only limited
out-of-order execution and so are much more sensitive to instruction
ordering.
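
For reference, a minimal sketch of the two loop shapes being discussed: the
trivial byte loop quoted above, and a word-at-a-time copy that never does an
unaligned access. This is not the code from the patch series. The names
copy_bytes(), copy_words() and word_t are hypothetical, the may_alias
attribute is a GCC/Clang extension, and falling back to the byte loop when
source and destination are misaligned differently is a simplification made
here for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical word type. may_alias (GCC/Clang extension) lets us read and
 * write arbitrary buffers through it without breaking strict aliasing.
 */
typedef unsigned long __attribute__((__may_alias__)) word_t;

/* The trivial loop: one byte load and one byte store per iteration. */
static void *copy_bytes(void *dest, const void *src, size_t count)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	while (count--)
		*d++ = *s++;
	return dest;
}

/* Word-at-a-time copy that only issues naturally aligned word accesses. */
static void *copy_words(void *dest, const void *src, size_t count)
{
	unsigned char *d = dest;
	const unsigned char *s = src;
	const uintptr_t mask = sizeof(word_t) - 1;

	/*
	 * Assumption of this sketch: if the two pointers are misaligned by
	 * different amounts, just use the byte loop.
	 */
	if (((uintptr_t)d & mask) != ((uintptr_t)s & mask))
		return copy_bytes(dest, src, count);

	/* Head: copy bytes until both pointers are word aligned. */
	while (count && ((uintptr_t)d & mask)) {
		*d++ = *s++;
		count--;
	}

	/* Bulk: one aligned word per iteration. */
	while (count >= sizeof(word_t)) {
		*(word_t *)d = *(const word_t *)s;
		d += sizeof(word_t);
		s += sizeof(word_t);
		count -= sizeof(word_t);
	}

	/* Tail: whatever is left, one byte at a time. */
	while (count--)
		*d++ = *s++;
	return dest;
}
```

How much the second form gains over the first is exactly the part that
depends on the micro-architecture, so it would need measuring on each target
cpu rather than assuming.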

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


Thread overview: 9+ messages
2021-07-02 12:31 [PATCH v2 0/3] lib/string: optimized mem* functions Matteo Croce
2021-07-02 12:31 ` [PATCH v2 1/3] lib/string: optimized memcpy Matteo Croce
2021-07-02 14:37   ` Ben Dooks
2021-07-02 14:44     ` Matteo Croce
2021-07-02 12:31 ` [PATCH v2 2/3] lib/string: optimized memmove Matteo Croce
2021-07-02 12:31 ` [PATCH v2 3/3] lib/string: optimized memset Matteo Croce
2021-07-10 21:31 ` [PATCH v2 0/3] lib/string: optimized mem* functions Andrew Morton
2021-07-10 23:07   ` Matteo Croce
2021-07-12  8:15     ` David Laight [this message]
