linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Segher Boessenkool <segher@kernel.crashing.org>
To: Bill Wendling <morbo@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Nick Desaulniers <ndesaulniers@google.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)"
	<x86@kernel.org>, Nathan Chancellor <nathan@kernel.org>,
	Juergen Gross <jgross@suse.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Andy Lutomirski <luto@kernel.org>,
	llvm@lists.linux.dev, LKML <linux-kernel@vger.kernel.org>,
	linux-toolchains <linux-toolchains@vger.kernel.org>
Subject: Re: [PATCH v5] x86: use builtins to read eflags
Date: Thu, 17 Mar 2022 17:37:56 -0500	[thread overview]
Message-ID: <20220317223756.GM614@gate.crashing.org> (raw)
In-Reply-To: <CAGG=3QW2ey2w91TxqJ6tzfJOswhTce2e0QTW7kAWyvxeiO+VNg@mail.gmail.com>

Hi!

On Thu, Mar 17, 2022 at 12:45:30PM -0700, Bill Wendling wrote:
> On Thu, Mar 17, 2022 at 11:52 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > It is perhaps telling that the LLVM discussion I found seems to talk
> > more about the performance impact, not about the fact that THE
> > GENERATED CODE WAS WRONG.
> >
> I think that's a bit unfair. There seemed to be a general consensus
> that the code was wrong and needed to be fixed. However, the compiler
> is also expected to generate the most performant code that it can.

I think Linus' point is that correctness is required, and performance
is just a nice to have.  I don't think you disagree, I sure don't :-)

> > In fact, gcc pretty much documents that "cc" clobbers doesn't do
> > anything on x86, and isn't needed. It just assumes that the arithmetic
> > flags always get clobbered, because on x86 that's pretty much the
> > case. They have sometimes been added for documentation purposes in the
> > kernel, but that's all they are.

GCC on x86 clobbers the arithmetic flags in every asm.  Long ago the x86
port still used CC0 (an internal representation of the condition code
flags), which acted effectively like clobbering the flags on every
instruction (CC0 will finally be gone in GCC 12 btw).  The x86 port
stopped using CC0 (and so started using the automatic "cc" clobber, to
keep old inline asm working) in 1999 :-)

"cc" is a valid clobber on every port, but it doesn't necessarily do
anything, and if it does, it does not necessarily correspond to any more
specific register, neither to a GCC register nor to a hardware register.
It also is not really useful for generic code, since the asm itself is
by nature pretty much target-specific :-)

> > But the whole "you can't move _other_ things that you don't even
> > understand around this either" is equally important. A "disable
> > interrupts" could easily be protecting a "read and modify a CPU MSR
> > value" too - no real "memory" access necessarily involved, but
> > "memory" is the only way we can tell you "don't move this".
> >
> And yet that's not guaranteed. From
> https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html:
> 
> ```
> Note that the compiler can move even volatile asm instructions
> relative to other code, including across jump instructions. For
> example, on many targets there is a system register that controls the
> rounding mode of floating-point operations. Setting it with a volatile
> asm statement, as in the following PowerPC example, does not work
> reliably.
> 
>   asm volatile("mtfsf 255, %0" : : "f" (fpenv));
>   sum = x + y;
> 
> The solution is to reference the "sum" variable in the asm:
> 
>   asm volatile ("mtfsf 255,%1" : "=X" (sum) : "f" (fpenv));
> ```
> 
> Note that the solution _isn't_ to add a "memory" clobber, because it's
> not guaranteed to work, as it's explicitly defined to be a read/write
> _memory_ barrier, despite what kernel writers wish it would do.

It doesn't work because *other* code can change the fp environment as
well; adding "memory" clobbers to both asms will keep them in order (if
they both are not deleted, etc).  This hinders optimisation very
seriously for code like this btw, another important reason to not do it
here.

And what is "memory" here anyway?  New memory items can be added very
late in the pass pipeline, and anything done in an earlier pass will
not have considered those of course.  The most important case of this
(but not the only case) is new slots on the stack.

> Your assertion that compilers don't know about control registers isn't
> exactly true. In the case of "pushf/popf", those instructions know
> about the eflags registers. All subsequent instructions that read or
> modify eflags also know about it. In essence, the compiler can
> determine its own clobber list, which include MSRs.

The compiler only knows about some bits in the flags register.  The
compiler does not know about most machine resources, including all of
the model-specific registers.


Segher

  parent reply	other threads:[~2022-03-17 22:45 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20211215211847.206208-1-morbo@google.com>
     [not found] ` <87mtl1l63m.ffs@tglx>
2021-12-16 19:55   ` [PATCH] x86: use builtins to read eflags Bill Wendling
2021-12-17 12:48     ` Peter Zijlstra
2021-12-17 19:39     ` Thomas Gleixner
2022-03-14 23:09     ` H. Peter Anvin
2022-03-15  0:08       ` Bill Wendling
     [not found] ` <20211215232605.GJ16608@worktop.programming.kicks-ass.net>
2021-12-16 20:00   ` Bill Wendling
2021-12-16 20:07     ` Nick Desaulniers
2021-12-29  2:12 ` [PATCH v2] " Bill Wendling
2022-01-27 20:56   ` Bill Wendling
2022-02-04  0:16   ` Thomas Gleixner
2022-02-04  0:58     ` Bill Wendling
2022-02-04  0:57   ` [PATCH v3] " Bill Wendling
2022-02-07 22:11     ` Nick Desaulniers
2022-02-08  9:14       ` David Laight
2022-02-08 23:18         ` Bill Wendling
2022-02-14 23:53         ` Nick Desaulniers
2022-02-10 22:31     ` [PATCH v4] " Bill Wendling
2022-02-11 16:40       ` David Laight
2022-02-11 19:25         ` Bill Wendling
2022-02-11 22:09           ` David Laight
2022-02-11 23:33             ` Bill Wendling
2022-02-12  0:24           ` Nick Desaulniers
2022-02-12  9:23             ` Bill Wendling
2022-02-15  0:33               ` Nick Desaulniers
2022-03-01 20:19       ` [PATCH v5] " Bill Wendling
2022-03-14 23:07         ` Bill Wendling
     [not found]           ` <AC3D873E-A28B-41F1-8BF4-2F6F37BCEEB4@zytor.com>
2022-03-15  7:19             ` Bill Wendling
2022-03-17 15:43               ` H. Peter Anvin
2022-03-17 18:00                 ` Nick Desaulniers
2022-03-17 18:52                   ` Linus Torvalds
2022-03-17 19:45                     ` Bill Wendling
2022-03-17 20:13                       ` Linus Torvalds
2022-03-17 21:10                         ` Bill Wendling
2022-03-17 21:21                           ` Linus Torvalds
2022-03-17 21:45                             ` Bill Wendling
2022-03-17 22:51                               ` Linus Torvalds
2022-03-17 23:14                                 ` Linus Torvalds
2022-03-17 23:19                                 ` Segher Boessenkool
2022-03-17 23:31                                   ` Linus Torvalds
2022-03-18  0:05                                     ` Segher Boessenkool
2022-03-17 22:37                       ` Segher Boessenkool [this message]
2022-03-17 20:13                     ` Florian Weimer
2022-03-17 20:36                       ` Linus Torvalds
2022-03-18  0:25                         ` Segher Boessenkool
2022-03-18  1:21                           ` Linus Torvalds
2022-03-18  1:50                             ` Linus Torvalds
2022-03-17 21:05                     ` Andrew Cooper
2022-03-17 21:39                       ` Linus Torvalds
2022-03-18 17:59                         ` Andy Lutomirski
2022-03-18 18:19                           ` Linus Torvalds
2022-03-18 21:48                             ` Andrew Cooper
2022-03-18 23:10                               ` Linus Torvalds
2022-03-18 23:42                                 ` Segher Boessenkool
2022-03-19  1:13                                   ` Linus Torvalds
2022-03-19 23:15                                   ` Andy Lutomirski
2022-03-18 22:09                             ` Segher Boessenkool
2022-03-18 22:33                               ` H. Peter Anvin
2022-03-18 22:36                               ` David Laight
2022-03-18 22:47                                 ` H. Peter Anvin
2022-03-18 22:43                             ` David Laight
2022-03-18 23:03                               ` H. Peter Anvin
2022-03-18 23:04                         ` Segher Boessenkool
2022-03-18 23:52                           ` David Laight

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220317223756.GM614@gate.crashing.org \
    --to=segher@kernel.crashing.org \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-toolchains@vger.kernel.org \
    --cc=llvm@lists.linux.dev \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=morbo@google.com \
    --cc=nathan@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).