linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Laight <David.Laight@ACULAB.COM>
To: 'Segher Boessenkool' <segher@kernel.crashing.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>,
	Nick Desaulniers <ndesaulniers@google.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Bill Wendling <morbo@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)"
	<x86@kernel.org>, Nathan Chancellor <nathan@kernel.org>,
	"Juergen Gross" <jgross@suse.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Andy Lutomirski" <luto@kernel.org>,
	"llvm@lists.linux.dev" <llvm@lists.linux.dev>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-toolchains <linux-toolchains@vger.kernel.org>
Subject: RE: [PATCH v5] x86: use builtins to read eflags
Date: Fri, 18 Mar 2022 23:52:01 +0000	[thread overview]
Message-ID: <04f65d1a90f640d4943c810f37016b01@AcuMS.aculab.com> (raw)
In-Reply-To: <20220318230425.GT614@gate.crashing.org>

From: Segher Boessenkool
> Sent: 18 March 2022 23:04
...
> The vast majority of compiler builtins are for simple transformations
> that the machine can do, for example with vector instructions.  Using
> such builtins does *not* instruct the compiler to use those machine
> insns, even if the builtin name would suggest that; instead, it asks to
> have code generated that has such semantics.  So it can be optimised by
> the compiler, much more than what can be done with inline asm.

Bah.
I wrote some small functions to convert blocks of 80 audio
samples between C 'float' and the 8-bit u-law and A-law floating
point formats - one set use the F16C conversions for denormalised
values.
I really want the instructions I've asked for in the order
I've asked for them.
I don't want the compiler doing stupid things.
(Like deciding to try to vectorise the bit of code at the end
that handled non 80 byte blocks.)

> It also can be optimised better by the compiler than if you would
> open-code the transforms (if you ask to frobnicate something, the
> compiler will know you want to frobnicate that thing, and it will not
> always recognise that is what you want if you just write it out in more
> general code).

Yep.
If I write 'for (i = 0; i < n; i++) foo[i] = bar[i]'
I want a loop - not a call to memcpy().
If I want a memcpy() I'll call memcpy().

And if I write:
	do {
		sum64a += buff32[0];
		sum64b += buff32[1];
		sum64a += buff32[2];
		sum64b += buff32[3];
		buff += 4;
	} while (buff != lim);
I don't want to see 'buff[1] + buff[2]' anywhere!
That loop has half a chance of running at 8 bytes/clock.
But not how gcc compiles it.

> Well-chosen builtin names are also much more readable than the best
> inline asm can ever be, and it can express much more in a much smaller
> space, without so much opportunity to make mistakes, either.

Hmmm...
Trying to write that SSE2/AVX code was a nightmare.
Chase through the cpu instruction set trying to sort out
the name of the required instruction.
Then search through the 'intrinsic' header to find the
name of the builtin.
Then disassemble the code to check the I'd got the right one.
I'm pretty sure the asm would have been shorter
and needed just as many comments.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


      reply	other threads:[~2022-03-18 23:52 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20211215211847.206208-1-morbo@google.com>
     [not found] ` <87mtl1l63m.ffs@tglx>
2021-12-16 19:55   ` [PATCH] x86: use builtins to read eflags Bill Wendling
2021-12-17 12:48     ` Peter Zijlstra
2021-12-17 19:39     ` Thomas Gleixner
2022-03-14 23:09     ` H. Peter Anvin
2022-03-15  0:08       ` Bill Wendling
     [not found] ` <20211215232605.GJ16608@worktop.programming.kicks-ass.net>
2021-12-16 20:00   ` Bill Wendling
2021-12-16 20:07     ` Nick Desaulniers
2021-12-29  2:12 ` [PATCH v2] " Bill Wendling
2022-01-27 20:56   ` Bill Wendling
2022-02-04  0:16   ` Thomas Gleixner
2022-02-04  0:58     ` Bill Wendling
2022-02-04  0:57   ` [PATCH v3] " Bill Wendling
2022-02-07 22:11     ` Nick Desaulniers
2022-02-08  9:14       ` David Laight
2022-02-08 23:18         ` Bill Wendling
2022-02-14 23:53         ` Nick Desaulniers
2022-02-10 22:31     ` [PATCH v4] " Bill Wendling
2022-02-11 16:40       ` David Laight
2022-02-11 19:25         ` Bill Wendling
2022-02-11 22:09           ` David Laight
2022-02-11 23:33             ` Bill Wendling
2022-02-12  0:24           ` Nick Desaulniers
2022-02-12  9:23             ` Bill Wendling
2022-02-15  0:33               ` Nick Desaulniers
2022-03-01 20:19       ` [PATCH v5] " Bill Wendling
2022-03-14 23:07         ` Bill Wendling
     [not found]           ` <AC3D873E-A28B-41F1-8BF4-2F6F37BCEEB4@zytor.com>
2022-03-15  7:19             ` Bill Wendling
2022-03-17 15:43               ` H. Peter Anvin
2022-03-17 18:00                 ` Nick Desaulniers
2022-03-17 18:52                   ` Linus Torvalds
2022-03-17 19:45                     ` Bill Wendling
2022-03-17 20:13                       ` Linus Torvalds
2022-03-17 21:10                         ` Bill Wendling
2022-03-17 21:21                           ` Linus Torvalds
2022-03-17 21:45                             ` Bill Wendling
2022-03-17 22:51                               ` Linus Torvalds
2022-03-17 23:14                                 ` Linus Torvalds
2022-03-17 23:19                                 ` Segher Boessenkool
2022-03-17 23:31                                   ` Linus Torvalds
2022-03-18  0:05                                     ` Segher Boessenkool
2022-03-17 22:37                       ` Segher Boessenkool
2022-03-17 20:13                     ` Florian Weimer
2022-03-17 20:36                       ` Linus Torvalds
2022-03-18  0:25                         ` Segher Boessenkool
2022-03-18  1:21                           ` Linus Torvalds
2022-03-18  1:50                             ` Linus Torvalds
2022-03-17 21:05                     ` Andrew Cooper
2022-03-17 21:39                       ` Linus Torvalds
2022-03-18 17:59                         ` Andy Lutomirski
2022-03-18 18:19                           ` Linus Torvalds
2022-03-18 21:48                             ` Andrew Cooper
2022-03-18 23:10                               ` Linus Torvalds
2022-03-18 23:42                                 ` Segher Boessenkool
2022-03-19  1:13                                   ` Linus Torvalds
2022-03-19 23:15                                   ` Andy Lutomirski
2022-03-18 22:09                             ` Segher Boessenkool
2022-03-18 22:33                               ` H. Peter Anvin
2022-03-18 22:36                               ` David Laight
2022-03-18 22:47                                 ` H. Peter Anvin
2022-03-18 22:43                             ` David Laight
2022-03-18 23:03                               ` H. Peter Anvin
2022-03-18 23:04                         ` Segher Boessenkool
2022-03-18 23:52                           ` David Laight [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=04f65d1a90f640d4943c810f37016b01@AcuMS.aculab.com \
    --to=david.laight@aculab.com \
    --cc=Andrew.Cooper3@citrix.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-toolchains@vger.kernel.org \
    --cc=llvm@lists.linux.dev \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=morbo@google.com \
    --cc=nathan@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=peterz@infradead.org \
    --cc=segher@kernel.crashing.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).