From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
To: Borislav Petkov <bp@alien8.de>,
	Rasmus Villemoes <mail@rasmusvillemoes.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	x86-ml <x86@kernel.org>, Andy Lutomirski <luto@kernel.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	lkml <linux-kernel@vger.kernel.org>
Subject: Re: [RFC] Improve memset
Date: Mon, 16 Sep 2019 11:18:33 +0200
Message-ID: <3fc31917-9452-3a10-d11d-056bf2d8b97d@rasmusvillemoes.dk>
In-Reply-To: <20190913163645.GC4190@zn.tnic>

On 13/09/2019 18.36, Borislav Petkov wrote:
> On Fri, Sep 13, 2019 at 12:42:32PM +0200, Borislav Petkov wrote:
>> Or should we talk to Intel hw folks about it...
> 
> Or, I can do something like this, while waiting. Benchmark at the end.
> 
> The numbers are from a KBL box:
> 
> model           : 158
> model name      : Intel(R) Core(TM) i5-9600K CPU @ 3.70GHz
> stepping        : 12
> 
> and if I'm not doing anything wrong with the benchmark 

Eh, this benchmark doesn't seem to provide any hints on where to set the
cut-off for a compile-time constant n, i.e. the 32 in

  __b_c_p(n) && n <= 32

- unless gcc has unrolled your loop completely, which I find highly
unlikely.

> (the asm looks
> ok

By "looks ok", do you mean that the builtin_memset() calls have been made
into calls to libc memset(), or how has gcc expanded that? And if so, what's
the disassembly of your libc's memset()? The thing is, what needs to be
compared is how a rep;stosb of 32 bytes compares to 4 immediate stores.
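
To make that comparison concrete, here is a rough userspace sketch of the
two candidates for n == 32. The function names are made up for
illustration and are not from the benchmark; the memcpy() calls are only
there to get alignment-safe 8-byte stores, and whether gcc really turns
them into four immediate stores needs to be checked in the disassembly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name: zero 32 bytes with four 8-byte stores. */
static void zero32_stores(void *dst)
{
	const uint64_t zero = 0;
	unsigned char *p = dst;

	/* memcpy() of a fixed 8 bytes is normally lowered to one store. */
	memcpy(p +  0, &zero, 8);
	memcpy(p +  8, &zero, 8);
	memcpy(p + 16, &zero, 8);
	memcpy(p + 24, &zero, 8);
}

/* Hypothetical name: zero 32 bytes with REP STOSB (x86-64 only). */
static void zero32_rep_stosb(void *dst)
{
#if defined(__x86_64__) && defined(__GNUC__)
	unsigned long count = 32;

	/* RDI = destination, RCX = byte count, AL = fill value. */
	__asm__ volatile("rep stosb"
			 : "+D" (dst), "+c" (count)
			 : "a" (0)
			 : "memory");
#else
	/* Fallback so the sketch still builds on other targets. */
	memset(dst, 0, 32);
#endif
}

/* Both variants must leave identical memory behind. */
static void zero32_check(void)
{
	unsigned char a[34], b[34];

	memset(a, 0xaa, sizeof(a));
	memset(b, 0xaa, sizeof(b));
	zero32_stores(a + 1);
	zero32_rep_stosb(b + 1);
	assert(memcmp(a, b, sizeof(a)) == 0);
	assert(a[0] == 0xaa && a[1] == 0 && a[32] == 0 && a[33] == 0xaa);
}
```

This only shows that the two are functionally equivalent; it says nothing
about which one is faster.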

In fact, perhaps we shouldn't even try to find a cutoff. If __b_c_p(n),
just use __builtin_memset unconditionally. If n is smallish, gcc will do
a few stores, and if n is largish and gcc ends up emitting a call to
memset(), well, we can optimize memset() itself based on cpu
capabilities _and_ it's not the call/ret that will dominate. There are
also optimization and diagnostic advantages of having gcc know the
semantics of the memset() call (e.g. the tr.b DSE you showed).
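
Roughly what I have in mind, as a userspace sketch only. sketch_memset()
and out_of_line_memset() are hypothetical names, and the byte loop merely
stands in for a memset() that would be tuned based on cpu capabilities.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the out-of-line, CPU-tuned memset(). */
static void *out_of_line_memset(void *dst, int c, size_t n)
{
	unsigned char *p = dst;

	while (n--)
		*p++ = (unsigned char)c;
	return dst;
}

/*
 * Hypothetical wrapper: no size cut-off. If n is a compile-time
 * constant, let the compiler decide between a few stores and a call;
 * otherwise go straight to the out-of-line version.
 */
#define sketch_memset(dst, c, n)				\
	(__builtin_constant_p(n) ?				\
		__builtin_memset((dst), (c), (n)) :		\
		out_of_line_memset((dst), (c), (n)))

static void sketch_demo(size_t len)
{
	struct { unsigned long r[4]; } regs;
	unsigned char buf[64];

	/* sizeof() is a constant, so this goes to __builtin_memset(). */
	sketch_memset(&regs, 0, sizeof(regs));
	assert(regs.r[0] == 0 && regs.r[3] == 0);

	/* len is not a literal constant here. */
	assert(len < sizeof(buf));
	buf[len] = 0xaa;
	sketch_memset(buf, 0x55, len);
	assert(len == 0 || buf[len - 1] == 0x55);
	assert(buf[len] == 0xaa);
}
```

Either arm gives the same result, so the only thing the wrapper decides is
who gets to pick the expansion.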

> but I could very well be missing something), the numbers say that
> the REP; STOSB is better from sizes of 8 and upwards and up to two
> cachelines we're pretty much on-par with the builtin variant.

I don't think that's what the numbers say.

Rasmus

Thread overview: 22+ messages
2019-09-13  7:22 [RFC] Improve memset Borislav Petkov
2019-09-13  7:35 ` Ingo Molnar
2019-09-13  7:50   ` Borislav Petkov
2019-09-13  8:51 ` Rasmus Villemoes
2019-09-13  9:00 ` Linus Torvalds
2019-09-13  9:18   ` Rasmus Villemoes
2019-09-13 10:42     ` Borislav Petkov
2019-09-13 16:36       ` Borislav Petkov
2019-09-16  9:18         ` Rasmus Villemoes [this message]
2019-09-16 17:25           ` Linus Torvalds
2019-09-16 17:40             ` Andy Lutomirski
2019-09-16 21:29               ` Linus Torvalds
2019-09-16 23:13                 ` Andy Lutomirski
2019-09-16 23:26                   ` Linus Torvalds
2019-09-17  8:15             ` Borislav Petkov
2019-09-17 10:55             ` David Laight
2019-09-17 20:10 ` Josh Poimboeuf
2019-09-17 20:45   ` Linus Torvalds
2019-09-19 12:55     ` Borislav Petkov
2019-09-19 12:49   ` Borislav Petkov
2019-09-14  9:29 Alexey Dobriyan
2019-09-14 11:39 ` Borislav Petkov
