From: Matteo Croce <mcroce@linux.microsoft.com>
To: Nick Kossifidis <mick@ics.forth.gr>
Cc: linux-riscv <linux-riscv@lists.infradead.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-arch <linux-arch@vger.kernel.org>,
Paul Walmsley <paul.walmsley@sifive.com>,
Palmer Dabbelt <palmer@dabbelt.com>,
Albert Ou <aou@eecs.berkeley.edu>,
Atish Patra <atish.patra@wdc.com>,
Emil Renner Berthing <kernel@esmil.dk>,
Akira Tsukamoto <akira.tsukamoto@gmail.com>,
Drew Fustini <drew@beagleboard.org>,
Bin Meng <bmeng.cn@gmail.com>,
David Laight <David.Laight@aculab.com>,
Guo Ren <guoren@kernel.org>
Subject: Re: [PATCH v3 3/3] riscv: optimized memset
Date: Wed, 23 Jun 2021 02:08:30 +0200
Message-ID: <CAFnufp2w1TGtaBjfTtsBpDatgAtATRZbB4MURV3tLh1fi-W1JQ@mail.gmail.com>
In-Reply-To: <17cd289430f08f2b75b7f04242c646f6@mailhost.ics.forth.gr>
On Tue, Jun 22, 2021 at 3:07 AM Nick Kossifidis <mick@ics.forth.gr> wrote:
>
> On 2021-06-17 18:27, Matteo Croce wrote:
> > +
> > +void *__memset(void *s, int c, size_t count)
> > +{
> > + union types dest = { .u8 = s };
> > +
> > + if (count >= MIN_THRESHOLD) {
> > + const int bytes_long = BITS_PER_LONG / 8;
>
> You could make 'const int bytes_long = BITS_PER_LONG / 8;' and 'const
> int mask = bytes_long - 1;' from your memcpy patch visible to memset as
> well (static const...) and use them here (mask would make more sense to
> be named as word_mask).
>
Will do.
> > + unsigned long cu = (unsigned long)c;
> > +
> > + /* Compose an ulong with 'c' repeated 4/8 times */
> > + cu |= cu << 8;
> > + cu |= cu << 16;
> > +#if BITS_PER_LONG == 64
> > + cu |= cu << 32;
> > +#endif
> > +
>
> You don't have to create cu here, you'll fill dest buffer with 'c'
> anyway so after filling up enough 'c's to be able to grab an aligned
> word full of them from dest, you can just grab that word and keep
> filling up dest with it.
>
I tried that, but then up to 8 more bytes have to be written one at a
time before the word-sized stores can start.
Also, generating 'cu' takes just 6 instructions on riscv:
	slli	a5,a0,8
	or	a5,a5,a0
	slli	a0,a5,16
	or	a0,a0,a5
	slli	a5,a0,32
	or	a0,a5,a0
so it's probably not worth it.
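For illustration, the 'cu' composition can be sketched in plain userspace C as below. This is not the patch code: the name repeat_byte() and the demo function are hypothetical, and the sketch masks 'c' down to its low byte first (memset semantics treat 'c' as an unsigned char), which the quoted hunk does not show.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper, not from the patch: replicate the low byte of 'c'
 * into every byte of an unsigned long using the same shift/or chain.
 */
static unsigned long repeat_byte(int c)
{
	unsigned long cu = (unsigned char)c;	/* keep only the low 8 bits */

	cu |= cu << 8;		/* 2 copies of the byte */
	cu |= cu << 16;		/* 4 copies */
#if ULONG_MAX > 0xffffffffUL
	cu |= cu << 32;		/* 8 copies when long is 64 bits wide */
#endif
	return cu;
}

/* Usage example: the result equals the byte multiplied by 0x0101...01. */
static void repeat_byte_demo(void)
{
	assert(repeat_byte(0xab) == 0xabUL * (ULONG_MAX / 0xff));
	assert(repeat_byte(0x1ab) == repeat_byte(0xab));
	assert(repeat_byte(0) == 0);
}
```

Each shift/or pair doubles the number of copies, which is why three pairs (six instructions) are enough on a 64-bit target.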
> > +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> > + /* Fill the buffer one byte at time until the destination
> > + * is aligned on a 32/64 bit boundary.
> > + */
> > + for (; count && dest.uptr % bytes_long; count--)
>
> You could reuse & mask here instead of % bytes_long.
>
Sure, although the generated machine code will be the same.
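A minimal sketch of why the two forms are interchangeable: for a power-of-two word size, 'p % bytes_long' and 'p & word_mask' give the same result for unsigned values. The names bytes_long and word_mask follow the review comments; the helper functions and the demo are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Names follow the review comments; the values are derived from sizeof(long). */
enum {
	bytes_long = sizeof(long),
	word_mask  = bytes_long - 1,
};

/* Hypothetical helpers: both report whether 'p' is off a word boundary. */
static int misaligned_mod(uintptr_t p)
{
	return p % bytes_long != 0;
}

static int misaligned_mask(uintptr_t p)
{
	return (p & word_mask) != 0;
}

/* Usage example: the two checks agree for every address tried. */
static void alignment_demo(void)
{
	for (uintptr_t p = 0; p < 64; p++)
		assert(misaligned_mod(p) == misaligned_mask(p));
}
```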
> > + *dest.u8++ = c;
> > +#endif
>
> I noticed you also used CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS on your
> memcpy patch, is it worth it here ? To begin with riscv doesn't set it
> and even if it did we are talking about a loop that will run just a few
> times to reach the alignment boundary (worst case scenario it'll run 7
> times), I don't think we gain much here, even for archs that have
> efficient unaligned access.
It doesn't _now_, but maybe in the future we will have a CPU which
handles unaligned accesses correctly!
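To make the cost of the alignment loop concrete, here is a simplified userspace sketch of the overall shape under discussion: a byte-wise head until the destination is word aligned, word-sized stores for the body, and a byte-wise tail. It is an illustration, not the patch: sketch_memset() and its demo are hypothetical names, the MIN_THRESHOLD check is left out, and memcpy() is used for the word store to stay within userspace aliasing rules.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch, not the kernel implementation. */
static void *sketch_memset(void *s, int c, size_t count)
{
	const uintptr_t word_mask = sizeof(unsigned long) - 1;
	unsigned char *d = s;
	unsigned long cu = (unsigned char)c;

	/* Repeat the byte across the whole word. */
	cu |= cu << 8;
	cu |= cu << 16;
#if ULONG_MAX > 0xffffffffUL
	cu |= cu << 32;
#endif

	/* Head: at most sizeof(long) - 1 byte stores to reach alignment. */
	for (; count && ((uintptr_t)d & word_mask); count--)
		*d++ = (unsigned char)c;

	/* Body: one word-sized store per iteration. */
	for (; count >= sizeof(cu); count -= sizeof(cu), d += sizeof(cu))
		memcpy(d, &cu, sizeof(cu));

	/* Tail: whatever is left after the last full word. */
	while (count--)
		*d++ = (unsigned char)c;

	return s;
}

/* Usage example: compare against the C library for many offsets/lengths. */
static void sketch_memset_demo(void)
{
	unsigned char buf[64], ref[64];

	for (size_t off = 0; off < 8; off++) {
		for (size_t len = 0; off + len <= sizeof(buf); len++) {
			memset(buf, 0, sizeof(buf));
			memset(ref, 0, sizeof(ref));
			sketch_memset(buf + off, 0x5a, len);
			memset(ref + off, 0x5a, len);
			assert(memcmp(buf, ref, sizeof(buf)) == 0);
		}
	}
}
```

The head loop is the part guarded by CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS in the quoted hunk. On a 64-bit target it runs at most 7 times, which is the worst case mentioned in the review.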
--
per aspera ad upstream